Overview

Dataset statistics

Number of variables: 21
Number of observations: 10866
Missing cells: 13434
Missing cells (%): 5.9%
Duplicate rows: 1
Duplicate rows (%): < 0.1%
Total size in memory: 1.7 MiB
Average record size in memory: 168.0 B
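
The percentage and size figures above follow from the raw counts. As a quick arithmetic check, the sketch below uses only numbers reported in this section; the variable names are illustrative and not part of the report.

```python
# Figures copied from the "Dataset statistics" table.
n_variables = 21
n_observations = 10866
missing_cells = 13434
avg_record_bytes = 168.0

# Every observation has one cell per variable.
total_cells = n_variables * n_observations

# Share of cells that are missing, in percent.
missing_pct = 100 * missing_cells / total_cells

# Approximate total size: average record size times record count, in MiB.
total_mib = avg_record_bytes * n_observations / 2**20

print(f"total cells: {total_cells}")
print(f"missing cells: {missing_pct:.1f}%")
print(f"size in memory: {total_mib:.1f} MiB")
```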

Variable types

Numeric: 10
Categorical: 11

Alerts

Dataset has 1 (< 0.1%) duplicate rows Duplicates
imdb_id has a high cardinality: 10855 distinct values High cardinality
original_title has a high cardinality: 10571 distinct values High cardinality
cast has a high cardinality: 10719 distinct values High cardinality
homepage has a high cardinality: 2896 distinct values High cardinality
director has a high cardinality: 5067 distinct values High cardinality
tagline has a high cardinality: 7997 distinct values High cardinality
keywords has a high cardinality: 8804 distinct values High cardinality
overview has a high cardinality: 10847 distinct values High cardinality
genres has a high cardinality: 2039 distinct values High cardinality
production_companies has a high cardinality: 7445 distinct values High cardinality
release_date has a high cardinality: 5909 distinct values High cardinality
id is highly correlated with release_year High correlation
popularity is highly correlated with revenue and 1 other fields High correlation
budget is highly correlated with revenue and 3 other fields High correlation
revenue is highly correlated with popularity and 4 other fields High correlation
vote_count is highly correlated with popularity and 4 other fields High correlation
release_year is highly correlated with id High correlation
budget_adj is highly correlated with budget and 3 other fields High correlation
revenue_adj is highly correlated with budget and 3 other fields High correlation
homepage has 7930 (73.0%) missing values Missing
tagline has 2824 (26.0%) missing values Missing
keywords has 1493 (13.7%) missing values Missing
production_companies has 1030 (9.5%) missing values Missing
imdb_id is uniformly distributed Uniform
original_title is uniformly distributed Uniform
cast is uniformly distributed Uniform
homepage is uniformly distributed Uniform
tagline is uniformly distributed Uniform
overview is uniformly distributed Uniform
budget has 5696 (52.4%) zeros Zeros
revenue has 6016 (55.4%) zeros Zeros
budget_adj has 5696 (52.4%) zeros Zeros
revenue_adj has 6016 (55.4%) zeros Zeros
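
The Missing and Zeros percentages in these alerts can be recomputed from the counts and the 10866 observations. The sketch below does only that arithmetic; the `share` helper is a hypothetical name, not part of pandas-profiling.

```python
n_observations = 10866

# Counts copied from the Missing and Zeros alerts.
missing_counts = {
    "homepage": 7930,
    "tagline": 2824,
    "keywords": 1493,
    "production_companies": 1030,
}
zero_counts = {"budget": 5696, "revenue": 6016}


def share(count, total=n_observations):
    """Return count as a percentage of total, rounded to one decimal place."""
    return round(100 * count / total, 1)


missing_pct = {name: share(count) for name, count in missing_counts.items()}
zero_pct = {name: share(count) for name, count in zero_counts.items()}

for name, pct in {**missing_pct, **zero_pct}.items():
    print(f"{name}: {pct}%")
```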

Reproduction

Analysis started: 2022-10-07 23:37:58.950256
Analysis finished: 2022-10-07 23:38:20.466536
Duration: 21.52 seconds
Software version: pandas-profiling v3.3.0
Download configuration: config.json
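
A report like this one is typically produced by passing a DataFrame to `ProfileReport`. The following is a minimal sketch, not the script that generated this report. The function name and both file names are placeholders, and the source file behind this report is not named here.

```python
def build_report(csv_path, html_path):
    """Profile a CSV file and write an HTML report (illustrative sketch)."""
    # Imported inside the function so the sketch loads even when the
    # libraries are not installed.
    import pandas as pd
    from pandas_profiling import ProfileReport

    df = pd.read_csv(csv_path)
    report = ProfileReport(df, title="Profiling report")
    report.to_file(html_path)
    return html_path


# Hypothetical usage; both paths are placeholders:
# build_report("movies.csv", "report.html")
```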

Variables

id
Real number (ℝ≥0)

HIGH CORRELATION

Distinct: 10865
Distinct (%): > 99.9%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 66064.17743
Minimum: 5
Maximum: 417859
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 85.0 KiB

Quantile statistics

Minimum: 5
5-th percentile: 1221
Q1: 10596.25
Median: 20669
Q3: 75610
95-th percentile: 288556
Maximum: 417859
Range: 417854
Interquartile range (IQR): 65013.75

Descriptive statistics

Standard deviation: 92130.13656
Coefficient of variation (CV): 1.394555115
Kurtosis: 1.781869015
Mean: 66064.17743
Median Absolute Deviation (MAD): 15121.5
Skewness: 1.732293939
Sum: 717853352
Variance: 8487962063
Monotonicity: Not monotonic
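
Several of the statistics reported for id are derived from others in the same tables. The sketch below recomputes the range, IQR, coefficient of variation, variance, and sum from the reported minimum, maximum, quartiles, mean, and standard deviation. Small differences from the reported values come from rounding in the report.

```python
# Figures copied from the quantile and descriptive statistics for id.
minimum, maximum = 5, 417859
q1, q3 = 10596.25, 75610
mean, std = 66064.17743, 92130.13656
n_observations = 10866

value_range = maximum - minimum   # Range
iqr = q3 - q1                     # Interquartile range
cv = std / mean                   # Coefficient of variation
variance = std ** 2               # Variance
total = mean * n_observations     # Sum (no missing values in id)

print(value_range, iqr)
print(f"CV = {cv:.6f}, variance = {variance:.0f}, sum = {total:.0f}")
```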
Histogram with fixed size bins (bins=50)
Common values
Value | Count | Frequency (%)
42194 | 2 | < 0.1%
2047 | 1 | < 0.1%
251232 | 1 | < 0.1%
19819 | 1 | < 0.1%
279914 | 1 | < 0.1%
61872 | 1 | < 0.1%
271718 | 1 | < 0.1%
13668 | 1 | < 0.1%
101731 | 1 | < 0.1%
1378 | 1 | < 0.1%
Other values (10855) | 10855 | 99.9%

Minimum 10 values
Value | Count | Frequency (%)
5 | 1 | < 0.1%
6 | 1 | < 0.1%
11 | 1 | < 0.1%
12 | 1 | < 0.1%
13 | 1 | < 0.1%
14 | 1 | < 0.1%
16 | 1 | < 0.1%
17 | 1 | < 0.1%
18 | 1 | < 0.1%
20 | 1 | < 0.1%

Maximum 10 values
Value | Count | Frequency (%)
417859 | 1 | < 0.1%
414419 | 1 | < 0.1%
409696 | 1 | < 0.1%
395883 | 1 | < 0.1%
395560 | 1 | < 0.1%
386501 | 1 | < 0.1%
382517 | 1 | < 0.1%
378373 | 1 | < 0.1%
376823 | 1 | < 0.1%
374430 | 1 | < 0.1%

imdb_id
Categorical

HIGH CARDINALITY
UNIFORM

Distinct: 10855
Distinct (%): > 99.9%
Missing: 10
Missing (%): 0.1%
Memory size: 85.0 KiB
tt0411951
 
2
tt0116277
 
1
tt0433442
 
1
tt3577624
 
1
tt0106873
 
1
Other values (10850)
10850 

Length

Max length: 9
Median length: 9
Mean length: 9
Min length: 9

Characters and Unicode

Total characters: 97704
Distinct characters: 11
Distinct categories: 2
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
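
The character totals for imdb_id are consistent with the length and missing-value figures: every non-missing value has 9 characters, and the sample rows all begin with the two letters "tt". The sketch below checks that arithmetic. Treating every id as "tt" followed by 7 digits is an assumption based on the samples, though the reported counts agree with it.

```python
n_observations = 10866
n_missing = 10
id_length = 9           # min, median, mean, and max length are all 9
letters_per_id = 2      # assumed "tt" prefix, as in the sample rows

non_missing = n_observations - n_missing
total_characters = non_missing * id_length
letter_count = non_missing * letters_per_id
digit_count = total_characters - letter_count

print(total_characters, letter_count, digit_count)
```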

Unique

Unique: 10854
Unique (%): > 99.9%

Sample

1st row: tt0369610
2nd row: tt1392190
3rd row: tt2908446
4th row: tt2488496
5th row: tt2820852

Common Values

Value | Count | Frequency (%)
tt0411951 | 2 | < 0.1%
tt0116277 | 1 | < 0.1%
tt0433442 | 1 | < 0.1%
tt3577624 | 1 | < 0.1%
tt0106873 | 1 | < 0.1%
tt1323925 | 1 | < 0.1%
tt0279079 | 1 | < 0.1%
tt0374563 | 1 | < 0.1%
tt0423294 | 1 | < 0.1%
tt0266543 | 1 | < 0.1%
Other values (10845) | 10845 | 99.8%
(Missing) | 10 | 0.1%

Length

Histogram of lengths of the category
ValueCountFrequency (%)
tt04119512
 
< 0.1%
tt23589251
 
< 0.1%
tt21948261
 
< 0.1%
tt02389481
 
< 0.1%
tt01164831
 
< 0.1%
tt23347331
 
< 0.1%
tt01346301
 
< 0.1%
tt01768921
 
< 0.1%
tt01125081
 
< 0.1%
tt00905551
 
< 0.1%
Other values (10845)10845
99.9%

Most occurring characters

ValueCountFrequency (%)
t21712
22.2%
015108
15.5%
110229
10.5%
27559
 
7.7%
36667
 
6.8%
46546
 
6.7%
86328
 
6.5%
76087
 
6.2%
96077
 
6.2%
65822
 
6.0%

Most occurring categories

ValueCountFrequency (%)
Decimal Number75992
77.8%
Lowercase Letter21712
 
22.2%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
015108
19.9%
110229
13.5%
27559
9.9%
36667
8.8%
46546
8.6%
86328
8.3%
76087
8.0%
96077
8.0%
65822
 
7.7%
55569
 
7.3%
Lowercase Letter
ValueCountFrequency (%)
t21712
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common75992
77.8%
Latin21712
 
22.2%

Most frequent character per script

Common
ValueCountFrequency (%)
015108
19.9%
110229
13.5%
27559
9.9%
36667
8.8%
46546
8.6%
86328
8.3%
76087
8.0%
96077
8.0%
65822
 
7.7%
55569
 
7.3%
Latin
ValueCountFrequency (%)
t21712
100.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII97704
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
t21712
22.2%
015108
15.5%
110229
10.5%
27559
 
7.7%
36667
 
6.8%
46546
 
6.7%
86328
 
6.5%
76087
 
6.2%
96077
 
6.2%
65822
 
6.0%

popularity
Real number (ℝ≥0)

HIGH CORRELATION

Distinct: 10814
Distinct (%): 99.5%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 0.646440952
Minimum: 6.5 × 10^-5
Maximum: 32.985763
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 85.0 KiB

Quantile statistics

Minimum: 6.5 × 10^-5
5-th percentile: 0.06425225
Q1: 0.20758275
Median: 0.3838555
Q3: 0.713817
95-th percentile: 2.04660175
Maximum: 32.985763
Range: 32.985698
Interquartile range (IQR): 0.50623425

Descriptive statistics

Standard deviation: 1.000184934
Coefficient of variation (CV): 1.54721778
Kurtosis: 210.9981321
Mean: 0.646440952
Median Absolute Deviation (MAD): 0.2153785
Skewness: 9.876331256
Sum: 7024.227384
Variance: 1.000369903
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
ValueCountFrequency (%)
0.1093052
 
< 0.1%
0.1140272
 
< 0.1%
0.1261822
 
< 0.1%
0.2479262
 
< 0.1%
0.4102352
 
< 0.1%
0.1225432
 
< 0.1%
0.3586942
 
< 0.1%
0.1555192
 
< 0.1%
0.9842562
 
< 0.1%
0.4685522
 
< 0.1%
Other values (10804)10846
99.8%
ValueCountFrequency (%)
6.5 × 10-51
< 0.1%
0.0001881
< 0.1%
0.000621
< 0.1%
0.0009731
< 0.1%
0.0011151
< 0.1%
0.0011171
< 0.1%
0.0013151
< 0.1%
0.0013171
< 0.1%
0.0013491
< 0.1%
0.0013721
< 0.1%
ValueCountFrequency (%)
32.9857631
< 0.1%
28.4199361
< 0.1%
24.9491341
< 0.1%
14.3112051
< 0.1%
13.1125071
< 0.1%
12.9710271
< 0.1%
12.0379331
< 0.1%
11.4227511
< 0.1%
11.1731041
< 0.1%
10.7390091
< 0.1%

budget
Real number (ℝ≥0)

HIGH CORRELATION
ZEROS

Distinct: 557
Distinct (%): 5.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 14625701.09
Minimum: 0
Maximum: 425000000
Zeros: 5696
Zeros (%): 52.4%
Negative: 0
Negative (%): 0.0%
Memory size: 85.0 KiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 0
Median: 0
Q3: 15000000
95-th percentile: 75000000
Maximum: 425000000
Range: 425000000
Interquartile range (IQR): 15000000

Descriptive statistics

Standard deviation: 30913213.83
Coefficient of variation (CV): 2.1136227
Kurtosis: 19.26943617
Mean: 14625701.09
Median Absolute Deviation (MAD): 0
Skewness: 3.717237057
Sum: 1.589228681 × 10^11
Variance: 9.556267894 × 10^14
Monotonicity: Not monotonic
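
The median and MAD of budget are both 0 because more than half of the values (52.4%) are zero. The sketch below reproduces that effect on a small made-up sample; the numbers are illustrative and are not taken from the dataset. It also shows how the median changes once zeros are excluded, which is one possible way to read the column if zeros stand for unknown budgets (an assumption the report does not confirm).

```python
from statistics import median


def median_and_mad(values):
    """Return the median and the median absolute deviation of values."""
    med = median(values)
    mad = median(abs(v - med) for v in values)
    return med, mad


# Toy sample: 6 of 10 values are zero, mirroring a zero-dominated column.
sample = [0, 0, 0, 0, 0, 0, 5_000_000, 15_000_000, 20_000_000, 75_000_000]

all_stats = median_and_mad(sample)
nonzero_median = median(v for v in sample if v != 0)

print(all_stats)        # both statistics collapse to zero
print(nonzero_median)   # median of the non-zero values only
```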
Histogram with fixed size bins (bins=50)
Common values
Value | Count | Frequency (%)
0 | 5696 | 52.4%
20000000 | 190 | 1.7%
15000000 | 183 | 1.7%
25000000 | 178 | 1.6%
10000000 | 176 | 1.6%
30000000 | 165 | 1.5%
5000000 | 141 | 1.3%
40000000 | 134 | 1.2%
35000000 | 128 | 1.2%
12000000 | 120 | 1.1%
Other values (547) | 3755 | 34.6%

Minimum 10 values
Value | Count | Frequency (%)
0 | 5696 | 52.4%
1 | 4 | < 0.1%
2 | 1 | < 0.1%
3 | 3 | < 0.1%
5 | 1 | < 0.1%
6 | 1 | < 0.1%
8 | 3 | < 0.1%
10 | 6 | 0.1%
11 | 1 | < 0.1%
12 | 2 | < 0.1%

Maximum 10 values
Value | Count | Frequency (%)
425000000 | 1 | < 0.1%
380000000 | 1 | < 0.1%
300000000 | 1 | < 0.1%
280000000 | 1 | < 0.1%
270000000 | 1 | < 0.1%
260000000 | 2 | < 0.1%
258000000 | 1 | < 0.1%
255000000 | 1 | < 0.1%
250000000 | 7 | 0.1%
245000000 | 1 | < 0.1%

revenue
Real number (ℝ≥0)

HIGH CORRELATION
ZEROS

Distinct: 4702
Distinct (%): 43.3%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 39823319.79
Minimum: 0
Maximum: 2781505847
Zeros: 6016
Zeros (%): 55.4%
Negative: 0
Negative (%): 0.0%
Memory size: 85.0 KiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 0
Median: 0
Q3: 24000000
95-th percentile: 213672164.5
Maximum: 2781505847
Range: 2781505847
Interquartile range (IQR): 24000000

Descriptive statistics

Standard deviation: 117003486.6
Coefficient of variation (CV): 2.938064611
Kurtosis: 73.16848928
Mean: 39823319.79
Median Absolute Deviation (MAD): 0
Skewness: 6.658397231
Sum: 4.327201929 × 10^11
Variance: 1.368981587 × 10^16
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
ValueCountFrequency (%)
06016
55.4%
1200000010
 
0.1%
100000008
 
0.1%
110000007
 
0.1%
60000006
 
0.1%
50000006
 
0.1%
20000006
 
0.1%
130000005
 
< 0.1%
200000005
 
< 0.1%
140000005
 
< 0.1%
Other values (4692)4792
44.1%
ValueCountFrequency (%)
06016
55.4%
22
 
< 0.1%
33
 
< 0.1%
52
 
< 0.1%
62
 
< 0.1%
92
 
< 0.1%
101
 
< 0.1%
113
 
< 0.1%
121
 
< 0.1%
132
 
< 0.1%
ValueCountFrequency (%)
27815058471
< 0.1%
20681782251
< 0.1%
18450341881
< 0.1%
15195579101
< 0.1%
15135288101
< 0.1%
15062493601
< 0.1%
14050357671
< 0.1%
13278178221
< 0.1%
12742190091
< 0.1%
12154399941
< 0.1%

original_title
Categorical

HIGH CARDINALITY
UNIFORM

Distinct: 10571
Distinct (%): 97.3%
Missing: 0
Missing (%): 0.0%
Memory size: 85.0 KiB
Hamlet
 
4
Beauty and the Beast
 
3
Shelter
 
3
Julia
 
3
Annie
 
3
Other values (10566)
10850 

Length

Max length: 104
Median length: 70
Mean length: 16.00220872
Min length: 1

Characters and Unicode

Total characters: 173880
Distinct characters: 164
Distinct categories: 19
Distinct scripts: 2
Distinct blocks: 6
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 10294
Unique (%): 94.7%

Sample

1st row: Jurassic World
2nd row: Mad Max: Fury Road
3rd row: Insurgent
4th row: Star Wars: The Force Awakens
5th row: Furious 7

Common Values

ValueCountFrequency (%)
Hamlet4
 
< 0.1%
Beauty and the Beast3
 
< 0.1%
Shelter3
 
< 0.1%
Julia3
 
< 0.1%
Annie3
 
< 0.1%
Alice in Wonderland3
 
< 0.1%
Frankenstein3
 
< 0.1%
Wuthering Heights3
 
< 0.1%
The Black Hole3
 
< 0.1%
Hercules3
 
< 0.1%
Other values (10561)10835
99.7%

Length

Histogram of lengths of the category
ValueCountFrequency (%)
the3279
 
10.6%
of969
 
3.1%
a386
 
1.2%
in327
 
1.1%
and317
 
1.0%
to227
 
0.7%
2226
 
0.7%
211
 
0.7%
man148
 
0.5%
for113
 
0.4%
Other values (8859)24843
80.0%

Most occurring characters

ValueCountFrequency (%)
20178
 
11.6%
e17617
 
10.1%
a10758
 
6.2%
o10239
 
5.9%
r9356
 
5.4%
n9343
 
5.4%
i9062
 
5.2%
t8414
 
4.8%
s6857
 
3.9%
h6463
 
3.7%
Other values (154)65593
37.7%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter122010
70.2%
Uppercase Letter27184
 
15.6%
Space Separator20195
 
11.6%
Other Punctuation2612
 
1.5%
Decimal Number1140
 
0.7%
Dash Punctuation212
 
0.1%
Modifier Symbol114
 
0.1%
Other Symbol101
 
0.1%
Currency Symbol78
 
< 0.1%
Other Number71
 
< 0.1%
Other values (9)163
 
0.1%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
e17617
14.4%
a10758
 
8.8%
o10239
 
8.4%
r9356
 
7.7%
n9343
 
7.7%
i9062
 
7.4%
t8414
 
6.9%
s6857
 
5.6%
h6463
 
5.3%
l5845
 
4.8%
Other values (35)28056
23.0%
Uppercase Letter
ValueCountFrequency (%)
T3616
 
13.3%
S2228
 
8.2%
B1815
 
6.7%
M1783
 
6.6%
C1613
 
5.9%
D1579
 
5.8%
A1541
 
5.7%
L1281
 
4.7%
H1244
 
4.6%
P1197
 
4.4%
Other values (27)9287
34.2%
Other Punctuation
ValueCountFrequency (%)
:1097
42.0%
'560
21.4%
.368
 
14.1%
,154
 
5.9%
&153
 
5.9%
!123
 
4.7%
?41
 
1.6%
/29
 
1.1%
¡14
 
0.5%
12
 
0.5%
Other values (14)61
 
2.3%
Decimal Number
ValueCountFrequency (%)
2327
28.7%
3172
15.1%
1172
15.1%
0167
14.6%
481
 
7.1%
567
 
5.9%
948
 
4.2%
742
 
3.7%
635
 
3.1%
829
 
2.5%
Currency Symbol
ValueCountFrequency (%)
29
37.2%
¤18
23.1%
¢12
15.4%
£10
 
12.8%
¥5
 
6.4%
$4
 
5.1%
Other Number
ValueCountFrequency (%)
¹22
31.0%
³14
19.7%
¼14
19.7%
½9
12.7%
²7
 
9.9%
¾5
 
7.0%
Modifier Symbol
ValueCountFrequency (%)
¸63
55.3%
¨24
 
21.1%
´14
 
12.3%
˜9
 
7.9%
¯4
 
3.5%
Other Symbol
ValueCountFrequency (%)
©60
59.4%
20
 
19.8%
°11
 
10.9%
¦7
 
6.9%
®3
 
3.0%
Math Symbol
ValueCountFrequency (%)
±9
36.0%
¬8
32.0%
×5
20.0%
+3
 
12.0%
Final Punctuation
ValueCountFrequency (%)
»8
38.1%
7
33.3%
3
 
14.3%
3
 
14.3%
Initial Punctuation
ValueCountFrequency (%)
7
29.2%
7
29.2%
«5
20.8%
5
20.8%
Dash Punctuation
ValueCountFrequency (%)
-206
97.2%
5
 
2.4%
1
 
0.5%
Open Punctuation
ValueCountFrequency (%)
(16
43.2%
14
37.8%
7
18.9%
Space Separator
ValueCountFrequency (%)
20178
99.9%
 17
 
0.1%
Other Letter
ValueCountFrequency (%)
ª13
61.9%
º8
38.1%
Close Punctuation
ValueCountFrequency (%)
)16
100.0%
Modifier Letter
ValueCountFrequency (%)
ˆ11
100.0%
Format
ValueCountFrequency (%)
­6
100.0%
Connector Punctuation
ValueCountFrequency (%)
_2
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin149210
85.8%
Common24670
 
14.2%

Most frequent character per script

Latin
ValueCountFrequency (%)
e17617
 
11.8%
a10758
 
7.2%
o10239
 
6.9%
r9356
 
6.3%
n9343
 
6.3%
i9062
 
6.1%
t8414
 
5.6%
s6857
 
4.6%
h6463
 
4.3%
l5845
 
3.9%
Other values (73)55256
37.0%
Common
ValueCountFrequency (%)
20178
81.8%
:1097
 
4.4%
'560
 
2.3%
.368
 
1.5%
2327
 
1.3%
-206
 
0.8%
3172
 
0.7%
1172
 
0.7%
0167
 
0.7%
,154
 
0.6%
Other values (71)1269
 
5.1%

Most occurring blocks

ValueCountFrequency (%)
ASCII172797
99.4%
None915
 
0.5%
Punctuation99
 
0.1%
Currency Symbols29
 
< 0.1%
Letterlike Symbols20
 
< 0.1%
Modifier Letters20
 
< 0.1%

Most frequent character per block

ASCII
ValueCountFrequency (%)
20178
 
11.7%
e17617
 
10.2%
a10758
 
6.2%
o10239
 
5.9%
r9356
 
5.4%
n9343
 
5.4%
i9062
 
5.2%
t8414
 
4.9%
s6857
 
4.0%
h6463
 
3.7%
Other values (73)64510
37.3%
None
ValueCountFrequency (%)
Ã162
 
17.7%
¸63
 
6.9%
©60
 
6.6%
à50
 
5.5%
ì31
 
3.4%
¨24
 
2.6%
¹22
 
2.4%
å21
 
2.3%
¤18
 
2.0%
 17
 
1.9%
Other values (52)447
48.9%
Currency Symbols
ValueCountFrequency (%)
29
100.0%
Letterlike Symbols
ValueCountFrequency (%)
20
100.0%
Punctuation
ValueCountFrequency (%)
14
14.1%
12
12.1%
11
11.1%
8
8.1%
7
7.1%
7
7.1%
7
7.1%
7
7.1%
7
7.1%
5
 
5.1%
Other values (5)14
14.1%
Modifier Letters
ValueCountFrequency (%)
ˆ11
55.0%
˜9
45.0%
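
The non-ASCII characters reported for original_title are dominated by Ã, ©, ¸ and similar symbols. That pattern often appears when UTF-8 text has been decoded as Latin-1 or Windows-1252. The report does not state the cause, so this is an assumption. Under that assumption, a repair could be sketched as below. `repair_mojibake` is a hypothetical helper, and it handles only the Latin-1 case.

```python
def repair_mojibake(text):
    """Undo a UTF-8-read-as-Latin-1 mix-up; return text unchanged if it does not apply."""
    try:
        return text.encode("latin-1").decode("utf-8")
    except (UnicodeEncodeError, UnicodeDecodeError):
        # Text was not garbled in this particular way; leave it alone.
        return text


# Simulate the damage: "é" written as UTF-8 but read back as Latin-1.
garbled = "\u00e9".encode("utf-8").decode("latin-1")
repaired = repair_mojibake(garbled)

print(repr(garbled), "->", repr(repaired))
print(repair_mojibake("Hamlet"))  # plain ASCII passes through unchanged
```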

cast
Categorical

HIGH CARDINALITY
UNIFORM

Distinct10719
Distinct (%)99.3%
Missing76
Missing (%)0.7%
Memory size85.0 KiB
Louis C.K.
 
6
William Shatner|Leonard Nimoy|DeForest Kelley|James Doohan|George Takei
 
5
Bill Burr
 
4
George Carlin
 
3
Chris Wedge
 
3
Other values (10714)
10769 

Length

Max length110
Median length94
Mean length67.87256719
Min length7

Characters and Unicode

Total characters732345
Distinct characters126
Distinct categories16 ?
Distinct scripts2 ?
Distinct blocks6 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique10666 ?
Unique (%)98.9%

Sample

1st row: Chris Pratt|Bryce Dallas Howard|Irrfan Khan|Vincent D'Onofrio|Nick Robinson
2nd row: Tom Hardy|Charlize Theron|Hugh Keays-Byrne|Nicholas Hoult|Josh Helman
3rd row: Shailene Woodley|Theo James|Kate Winslet|Ansel Elgort|Miles Teller
4th row: Harrison Ford|Mark Hamill|Carrie Fisher|Adam Driver|Daisy Ridley
5th row: Vin Diesel|Paul Walker|Jason Statham|Michelle Rodriguez|Dwayne Johnson
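
The sample rows show that each cast value packs several names into one string separated by `|`, which explains the high cardinality of the column. A minimal sketch of splitting such a value into individual names follows. `split_names` is a hypothetical helper, and returning an empty list for missing values is a design choice made here, not something the report prescribes.

```python
def split_names(value, separator="|"):
    """Split a pipe-separated string into a list of names; missing values give []."""
    if value is None:
        return []
    return [name.strip() for name in value.split(separator) if name.strip()]


# First sample row of the cast column.
first_row = "Chris Pratt|Bryce Dallas Howard|Irrfan Khan|Vincent D'Onofrio|Nick Robinson"
actors = split_names(first_row)

print(actors)
print(split_names(None))
```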

Common Values

ValueCountFrequency (%)
Louis C.K.6
 
0.1%
William Shatner|Leonard Nimoy|DeForest Kelley|James Doohan|George Takei5
 
< 0.1%
Bill Burr4
 
< 0.1%
George Carlin3
 
< 0.1%
Chris Wedge3
 
< 0.1%
Sylvester Stallone|Talia Shire|Burt Young|Carl Weathers|Burgess Meredith3
 
< 0.1%
Pierre Coffin3
 
< 0.1%
Zac Efron|Vanessa Hudgens|Ashley Tisdale|Lucas Grabeel|Corbin Bleu3
 
< 0.1%
Jennifer Lawrence|Josh Hutcherson|Liam Hemsworth|Woody Harrelson|Elizabeth Banks3
 
< 0.1%
Jim Jefferies3
 
< 0.1%
Other values (10709)10754
99.0%
(Missing)76
 
0.7%

Length

Histogram of lengths of the category
ValueCountFrequency (%)
michael285
 
0.4%
john231
 
0.3%
de209
 
0.3%
james177
 
0.3%
robert158
 
0.2%
tom154
 
0.2%
lee139
 
0.2%
jason132
 
0.2%
van131
 
0.2%
david128
 
0.2%
Other values (46644)64967
97.4%

Most occurring characters

ValueCountFrequency (%)
e65672
 
9.0%
a61498
 
8.4%
55922
 
7.6%
n50507
 
6.9%
r44415
 
6.1%
i42812
 
5.8%
|41783
 
5.7%
o37725
 
5.2%
l35122
 
4.8%
t24677
 
3.4%
Other values (116)272212
37.2%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter517877
70.7%
Uppercase Letter112598
 
15.4%
Space Separator55935
 
7.6%
Math Symbol41835
 
5.7%
Other Punctuation2123
 
0.3%
Dash Punctuation811
 
0.1%
Other Symbol591
 
0.1%
Format130
 
< 0.1%
Modifier Symbol119
 
< 0.1%
Other Number107
 
< 0.1%
Other values (6)219
 
< 0.1%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
e65672
12.7%
a61498
11.9%
n50507
9.8%
r44415
 
8.6%
i42812
 
8.3%
o37725
 
7.3%
l35122
 
6.8%
t24677
 
4.8%
s24214
 
4.7%
h19623
 
3.8%
Other values (24)111612
21.6%
Uppercase Letter
ValueCountFrequency (%)
M9756
 
8.7%
J9025
 
8.0%
S8852
 
7.9%
C8638
 
7.7%
B8209
 
7.3%
D7012
 
6.2%
R6503
 
5.8%
A6314
 
5.6%
L5342
 
4.7%
H5002
 
4.4%
Other values (24)37945
33.7%
Other Punctuation
ValueCountFrequency (%)
.1300
61.2%
'481
 
22.7%
¡147
 
6.9%
,49
 
2.3%
43
 
2.0%
§36
 
1.7%
28
 
1.3%
16
 
0.8%
10
 
0.5%
"6
 
0.3%
Other values (5)7
 
0.3%
Other Number
ValueCountFrequency (%)
³50
46.7%
¼49
45.8%
²4
 
3.7%
¾2
 
1.9%
¹1
 
0.9%
½1
 
0.9%
Other Symbol
ValueCountFrequency (%)
©567
95.9%
11
 
1.9%
®9
 
1.5%
°3
 
0.5%
¦1
 
0.2%
Modifier Symbol
ValueCountFrequency (%)
¨53
44.5%
¯28
23.5%
¸24
20.2%
´13
 
10.9%
˜1
 
0.8%
Currency Symbol
ValueCountFrequency (%)
¥39
54.2%
¤20
27.8%
£6
 
8.3%
4
 
5.6%
¢3
 
4.2%
Initial Punctuation
ValueCountFrequency (%)
«56
87.5%
6
 
9.4%
1
 
1.6%
1
 
1.6%
Math Symbol
ValueCountFrequency (%)
|41783
99.9%
±50
 
0.1%
¬2
 
< 0.1%
Dash Punctuation
ValueCountFrequency (%)
-804
99.1%
6
 
0.7%
1
 
0.1%
Decimal Number
ValueCountFrequency (%)
012
48.0%
512
48.0%
21
 
4.0%
Space Separator
ValueCountFrequency (%)
55922
> 99.9%
 13
 
< 0.1%
Other Letter
ValueCountFrequency (%)
º30
93.8%
ª2
 
6.2%
Final Punctuation
ValueCountFrequency (%)
»11
73.3%
4
 
26.7%
Open Punctuation
ValueCountFrequency (%)
7
63.6%
4
36.4%
Format
ValueCountFrequency (%)
­130
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin630506
86.1%
Common101839
 
13.9%

Most frequent character per script

Latin
ValueCountFrequency (%)
e65672
 
10.4%
a61498
 
9.8%
n50507
 
8.0%
r44415
 
7.0%
i42812
 
6.8%
o37725
 
6.0%
l35122
 
5.6%
t24677
 
3.9%
s24214
 
3.8%
h19623
 
3.1%
Other values (59)224241
35.6%
Common
ValueCountFrequency (%)
55922
54.9%
|41783
41.0%
.1300
 
1.3%
-804
 
0.8%
©567
 
0.6%
'481
 
0.5%
¡147
 
0.1%
­130
 
0.1%
«56
 
0.1%
¨53
 
0.1%
Other values (47)596
 
0.6%

Most occurring blocks

ValueCountFrequency (%)
ASCII729305
99.6%
None2939
 
0.4%
Punctuation85
 
< 0.1%
Letterlike Symbols11
 
< 0.1%
Currency Symbols4
 
< 0.1%
Modifier Letters1
 
< 0.1%

Most frequent character per block

ASCII
ValueCountFrequency (%)
e65672
 
9.0%
a61498
 
8.4%
55922
 
7.7%
n50507
 
6.9%
r44415
 
6.1%
i42812
 
5.9%
|41783
 
5.7%
o37725
 
5.2%
l35122
 
4.8%
t24677
 
3.4%
Other values (54)269172
36.9%
None
ValueCountFrequency (%)
Ã1409
47.9%
©567
19.3%
¡147
 
5.0%
­130
 
4.4%
«56
 
1.9%
¨53
 
1.8%
³50
 
1.7%
±50
 
1.7%
¼49
 
1.7%
Ä48
 
1.6%
Other values (37)380
 
12.9%
Punctuation
ValueCountFrequency (%)
28
32.9%
16
18.8%
10
 
11.8%
7
 
8.2%
6
 
7.1%
6
 
7.1%
4
 
4.7%
4
 
4.7%
1
 
1.2%
1
 
1.2%
Other values (2)2
 
2.4%
Letterlike Symbols
ValueCountFrequency (%)
11
100.0%
Currency Symbols
ValueCountFrequency (%)
4
100.0%
Modifier Letters
ValueCountFrequency (%)
˜1
100.0%

homepage
Categorical

HIGH CARDINALITY
MISSING
UNIFORM

Distinct2896
Distinct (%)98.6%
Missing7930
Missing (%)73.0%
Memory size85.0 KiB
http://www.missionimpossible.com/
 
4
http://www.thehungergames.movie/
 
4
http://phantasm.com
 
4
http://www.georgecarlin.com
 
3
http://www.americanreunionmovie.com/
 
3
Other values (2891)
2918 

Length

Max length242
Median length89
Mean length37.14611717
Min length13

Characters and Unicode

Total characters109061
Distinct characters83
Distinct categories10 ?
Distinct scripts2 ?
Distinct blocks3 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique2868 ?
Unique (%)97.7%

Sample

1st rowhttp://www.jurassicworld.com/
2nd rowhttp://www.madmaxmovie.com/
3rd rowhttp://www.thedivergentseries.movie/#insurgent
4th rowhttp://www.starwars.com/films/star-wars-episode-vii
5th rowhttp://www.furious7.com/

Common Values

ValueCountFrequency (%)
http://www.missionimpossible.com/4
 
< 0.1%
http://www.thehungergames.movie/4
 
< 0.1%
http://phantasm.com4
 
< 0.1%
http://www.georgecarlin.com3
 
< 0.1%
http://www.americanreunionmovie.com/3
 
< 0.1%
http://www.kungfupanda.com/3
 
< 0.1%
http://www.thehobbit.com/3
 
< 0.1%
http://www.jeffdunham.com3
 
< 0.1%
http://www.transformersmovie.com/3
 
< 0.1%
http://www.howtotrainyourdragon.com/2
 
< 0.1%
Other values (2886)2904
 
26.7%
(Missing)7930
73.0%

Length

Histogram of lengths of the category
ValueCountFrequency (%)
http://www.missionimpossible.com5
 
0.2%
http://www.thehungergames.movie4
 
0.1%
http://phantasm.com4
 
0.1%
http://www.transformersmovie.com4
 
0.1%
http://www.kungfupanda.com4
 
0.1%
http://www.thehobbit.com3
 
0.1%
http://www.jeffdunham.com3
 
0.1%
http://www.lordoftherings.net3
 
0.1%
http://www.americanreunionmovie.com3
 
0.1%
http://www.georgecarlin.com3
 
0.1%
Other values (2878)2900
98.8%

Most occurring characters

ValueCountFrequency (%)
t9865
 
9.0%
/9843
 
9.0%
e7545
 
6.9%
w7528
 
6.9%
o7522
 
6.9%
m6035
 
5.5%
.5844
 
5.4%
h5380
 
4.9%
i5336
 
4.9%
c4475
 
4.1%
Other values (73)39688
36.4%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter86844
79.6%
Other Punctuation18787
 
17.2%
Dash Punctuation1265
 
1.2%
Decimal Number1247
 
1.1%
Uppercase Letter648
 
0.6%
Connector Punctuation176
 
0.2%
Math Symbol81
 
0.1%
Open Punctuation6
 
< 0.1%
Close Punctuation6
 
< 0.1%
Currency Symbol1
 
< 0.1%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
t9865
 
11.4%
e7545
 
8.7%
w7528
 
8.7%
o7522
 
8.7%
m6035
 
6.9%
h5380
 
6.2%
i5336
 
6.1%
c4475
 
5.2%
p4174
 
4.8%
a3960
 
4.6%
Other values (17)25024
28.8%
Uppercase Letter
ValueCountFrequency (%)
T61
 
9.4%
S59
 
9.1%
M54
 
8.3%
A46
 
7.1%
E40
 
6.2%
F36
 
5.6%
B35
 
5.4%
D34
 
5.2%
H28
 
4.3%
C27
 
4.2%
Other values (17)228
35.2%
Other Punctuation
ValueCountFrequency (%)
/9843
52.4%
.5844
31.1%
:2941
 
15.7%
?59
 
0.3%
#40
 
0.2%
%26
 
0.1%
&18
 
0.1%
!8
 
< 0.1%
,6
 
< 0.1%
'2
 
< 0.1%
Decimal Number
ValueCountFrequency (%)
2240
19.2%
0199
16.0%
1196
15.7%
3141
11.3%
491
 
7.3%
882
 
6.6%
978
 
6.3%
775
 
6.0%
574
 
5.9%
671
 
5.7%
Math Symbol
ValueCountFrequency (%)
=76
93.8%
+5
 
6.2%
Open Punctuation
ValueCountFrequency (%)
(5
83.3%
{1
 
16.7%
Close Punctuation
ValueCountFrequency (%)
)5
83.3%
}1
 
16.7%
Dash Punctuation
ValueCountFrequency (%)
-1265
100.0%
Connector Punctuation
ValueCountFrequency (%)
_176
100.0%
Currency Symbol
ValueCountFrequency (%)
1
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin87492
80.2%
Common21569
 
19.8%

Most frequent character per script

Latin
ValueCountFrequency (%)
t9865
 
11.3%
e7545
 
8.6%
w7528
 
8.6%
o7522
 
8.6%
m6035
 
6.9%
h5380
 
6.1%
i5336
 
6.1%
c4475
 
5.1%
p4174
 
4.8%
a3960
 
4.5%
Other values (44)25672
29.3%
Common
ValueCountFrequency (%)
/9843
45.6%
.5844
27.1%
:2941
 
13.6%
-1265
 
5.9%
2240
 
1.1%
0199
 
0.9%
1196
 
0.9%
_176
 
0.8%
3141
 
0.7%
491
 
0.4%
Other values (19)633
 
2.9%

Most occurring blocks

ValueCountFrequency (%)
ASCII109058
> 99.9%
None2
 
< 0.1%
Currency Symbols1
 
< 0.1%

Most frequent character per block

ASCII
ValueCountFrequency (%)
t9865
 
9.0%
/9843
 
9.0%
e7545
 
6.9%
w7528
 
6.9%
o7522
 
6.9%
m6035
 
5.5%
.5844
 
5.4%
h5380
 
4.9%
i5336
 
4.9%
c4475
 
4.1%
Other values (70)39685
36.4%
Currency Symbols
ValueCountFrequency (%)
1
100.0%
None
ValueCountFrequency (%)
Ž1
50.0%
â1
50.0%

director
Categorical

HIGH CARDINALITY

Distinct5067
Distinct (%)46.8%
Missing44
Missing (%)0.4%
Memory size85.0 KiB
Woody Allen
 
45
Clint Eastwood
 
34
Martin Scorsese
 
29
Steven Spielberg
 
29
Ridley Scott
 
23
Other values (5062)
10662 

Length

Max length533
Median length169
Mean length14.55812234
Min length2

Characters and Unicode

Total characters157548
Distinct characters96
Distinct categories18 ?
Distinct scripts2 ?
Distinct blocks5 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique3217 ?
Unique (%)29.7%

Sample

1st rowColin Trevorrow
2nd rowGeorge Miller
3rd rowRobert Schwentke
4th rowJ.J. Abrams
5th rowJames Wan

Common Values

ValueCountFrequency (%)
Woody Allen45
 
0.4%
Clint Eastwood34
 
0.3%
Martin Scorsese29
 
0.3%
Steven Spielberg29
 
0.3%
Ridley Scott23
 
0.2%
Ron Howard22
 
0.2%
Steven Soderbergh22
 
0.2%
Joel Schumacher21
 
0.2%
Brian De Palma20
 
0.2%
Barry Levinson19
 
0.2%
Other values (5057)10558
97.2%
(Missing)44
 
0.4%

Length

Histogram of lengths of the category
ValueCountFrequency (%)
john436
 
1.8%
michael308
 
1.3%
david301
 
1.3%
robert212
 
0.9%
peter201
 
0.8%
james162
 
0.7%
richard159
 
0.7%
paul144
 
0.6%
mark110
 
0.5%
lee107
 
0.5%
Other values (6202)21600
91.0%

Most occurring characters

ValueCountFrequency (%)
e14631
 
9.3%
12934
 
8.2%
a12669
 
8.0%
n11046
 
7.0%
r10688
 
6.8%
o9160
 
5.8%
i9108
 
5.8%
l7511
 
4.8%
t5475
 
3.5%
s5207
 
3.3%
Other values (86)59119
37.5%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter116535
74.0%
Uppercase Letter25718
 
16.3%
Space Separator12936
 
8.2%
Math Symbol1088
 
0.7%
Other Punctuation839
 
0.5%
Dash Punctuation178
 
0.1%
Other Symbol118
 
0.1%
Format35
 
< 0.1%
Other Number32
 
< 0.1%
Modifier Symbol28
 
< 0.1%
Other values (8)41
 
< 0.1%

Most frequent character per category

Uppercase Letter
ValueCountFrequency (%)
S2357
 
9.2%
J2208
 
8.6%
M2207
 
8.6%
R1800
 
7.0%
B1686
 
6.6%
C1612
 
6.3%
D1484
 
5.8%
A1433
 
5.6%
G1264
 
4.9%
L1223
 
4.8%
Other values (20)8444
32.8%
Lowercase Letter
ValueCountFrequency (%)
e14631
12.6%
a12669
10.9%
n11046
9.5%
r10688
9.2%
o9160
 
7.9%
i9108
 
7.8%
l7511
 
6.4%
t5475
 
4.7%
s5207
 
4.5%
h4545
 
3.9%
Other values (16)26495
22.7%
Other Punctuation
ValueCountFrequency (%)
.645
76.9%
¡65
 
7.7%
'52
 
6.2%
30
 
3.6%
,18
 
2.1%
§15
 
1.8%
5
 
0.6%
5
 
0.6%
4
 
0.5%
Modifier Symbol
ValueCountFrequency (%)
¨12
42.9%
´9
32.1%
¸5
17.9%
¯2
 
7.1%
Math Symbol
ValueCountFrequency (%)
|1070
98.3%
±17
 
1.6%
¬1
 
0.1%
Other Symbol
ValueCountFrequency (%)
©112
94.9%
¦5
 
4.2%
1
 
0.8%
Currency Symbol
ValueCountFrequency (%)
¥11
57.9%
¤7
36.8%
1
 
5.3%
Space Separator
ValueCountFrequency (%)
12934
> 99.9%
 2
 
< 0.1%
Dash Punctuation
ValueCountFrequency (%)
-177
99.4%
1
 
0.6%
Other Number
ValueCountFrequency (%)
³27
84.4%
¼5
 
15.6%
Open Punctuation
ValueCountFrequency (%)
4
80.0%
(1
 
20.0%
Final Punctuation
ValueCountFrequency (%)
»3
75.0%
1
 
25.0%
Initial Punctuation
ValueCountFrequency (%)
«3
75.0%
1
 
25.0%
Other Letter
ValueCountFrequency (%)
º3
75.0%
ª1
 
25.0%
Format
ValueCountFrequency (%)
­35
100.0%
Control
ValueCountFrequency (%)
3
100.0%
Decimal Number
ValueCountFrequency (%)
91
100.0%
Close Punctuation
ValueCountFrequency (%)
)1
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin142257
90.3%
Common15291
 
9.7%

Most frequent character per script

Latin
ValueCountFrequency (%)
e14631
 
10.3%
a12669
 
8.9%
n11046
 
7.8%
r10688
 
7.5%
o9160
 
6.4%
i9108
 
6.4%
l7511
 
5.3%
t5475
 
3.8%
s5207
 
3.7%
h4545
 
3.2%
Other values (48)52217
36.7%
Common
ValueCountFrequency (%)
12934
84.6%
|1070
 
7.0%
.645
 
4.2%
-177
 
1.2%
©112
 
0.7%
¡65
 
0.4%
'52
 
0.3%
­35
 
0.2%
30
 
0.2%
³27
 
0.2%
Other values (28)144
 
0.9%

Most occurring blocks

ValueCountFrequency (%)
ASCII156746
99.5%
None779
 
0.5%
Punctuation21
 
< 0.1%
Letterlike Symbols1
 
< 0.1%
Currency Symbols1
 
< 0.1%

Most frequent character per block

ASCII
ValueCountFrequency (%)
e14631
 
9.3%
12934
 
8.3%
a12669
 
8.1%
n11046
 
7.0%
r10688
 
6.8%
o9160
 
5.8%
i9108
 
5.8%
l7511
 
4.8%
t5475
 
3.5%
s5207
 
3.3%
Other values (52)58317
37.2%
None
ValueCountFrequency (%)
Ã377
48.4%
©112
 
14.4%
¡65
 
8.3%
­35
 
4.5%
30
 
3.9%
³27
 
3.5%
±17
 
2.2%
§15
 
1.9%
Å15
 
1.9%
Ä14
 
1.8%
Other values (15)72
 
9.2%
Punctuation
ValueCountFrequency (%)
5
23.8%
5
23.8%
4
19.0%
4
19.0%
1
 
4.8%
1
 
4.8%
1
 
4.8%
Letterlike Symbols
ValueCountFrequency (%)
1
100.0%
Currency Symbols
ValueCountFrequency (%)
1
100.0%

tagline
Categorical

HIGH CARDINALITY
MISSING
UNIFORM

Distinct7997
Distinct (%)99.4%
Missing2824
Missing (%)26.0%
Memory size85.0 KiB
Based on a true story.
 
5
Be careful what you wish for.
 
3
Two Films. One Love.
 
3
What you know about fear... doesn't even come close.
 
2
The chase is on!
 
2
Other values (7992)
8027 

Length

Max length286
Median length166
Mean length44.17445909
Min length1

Characters and Unicode

Total characters355251
Distinct characters126
Distinct categories16 ?
Distinct scripts2 ?
Distinct blocks6 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique7957 ?
Unique (%)98.9%

Sample

1st rowThe park is open.
2nd rowWhat a Lovely Day.
3rd rowOne Choice Can Destroy You
4th rowEvery generation has a story.
5th rowVengeance Hits Home

Common Values

ValueCountFrequency (%)
Based on a true story.5
 
< 0.1%
Be careful what you wish for.3
 
< 0.1%
Two Films. One Love.3
 
< 0.1%
What you know about fear... doesn't even come close.2
 
< 0.1%
The chase is on!2
 
< 0.1%
Love is a force of nature.2
 
< 0.1%
There is no turning back2
 
< 0.1%
It's A Trap2
 
< 0.1%
Misery loves family.2
 
< 0.1%
Who is John Galt?2
 
< 0.1%
Other values (7987)8017
73.8%
(Missing)2824
 
26.0%

Length

Histogram of lengths of the category
ValueCountFrequency (%)
the3989
 
6.1%
a2410
 
3.7%
to1460
 
2.2%
of1335
 
2.0%
is1247
 
1.9%
you1095
 
1.7%
in951
 
1.5%
and804
 
1.2%
one619
 
0.9%
it611
 
0.9%
Other values (7019)50857
77.8%

Most occurring characters

ValueCountFrequency (%)
57392
16.2%
e36795
 
10.4%
t22060
 
6.2%
o21674
 
6.1%
a18810
 
5.3%
n17991
 
5.1%
i17300
 
4.9%
r16801
 
4.7%
s16059
 
4.5%
h14126
 
4.0%
Other values (116)116243
32.7%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter258603
72.8%
Space Separator57393
 
16.2%
Uppercase Letter21147
 
6.0%
Other Punctuation16563
 
4.7%
Decimal Number950
 
0.3%
Dash Punctuation364
 
0.1%
Currency Symbol96
 
< 0.1%
Other Symbol83
 
< 0.1%
Open Punctuation17
 
< 0.1%
Close Punctuation15
 
< 0.1%
Other values (6)20
 
< 0.1%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
e36795
14.2%
t22060
 
8.5%
o21674
 
8.4%
a18810
 
7.3%
n17991
 
7.0%
i17300
 
6.7%
r16801
 
6.5%
s16059
 
6.2%
h14126
 
5.5%
l11024
 
4.3%
Other values (26)65963
25.5%
Uppercase Letter
ValueCountFrequency (%)
T3179
15.0%
A1962
 
9.3%
S1542
 
7.3%
H1327
 
6.3%
I1262
 
6.0%
W1211
 
5.7%
B1038
 
4.9%
F899
 
4.3%
N896
 
4.2%
E889
 
4.2%
Other values (25)6942
32.8%
Other Punctuation
ValueCountFrequency (%)
.10962
66.2%
'2278
 
13.8%
,1587
 
9.6%
!1041
 
6.3%
?481
 
2.9%
"78
 
0.5%
:40
 
0.2%
*25
 
0.2%
&22
 
0.1%
#16
 
0.1%
Other values (8)33
 
0.2%
Decimal Number
Value   Count   Frequency (%)
0       266     28.0%
1       183     19.3%
2       109     11.5%
3       80      8.4%
9       76      8.0%
5       54      5.7%
4       54      5.7%
6       49      5.2%
7       45      4.7%
8       34      3.6%
Open Punctuation
ValueCountFrequency (%)
(13
76.5%
[2
 
11.8%
1
 
5.9%
1
 
5.9%
Modifier Symbol
ValueCountFrequency (%)
˜1
25.0%
¯1
25.0%
´1
25.0%
`1
25.0%
Currency Symbol
ValueCountFrequency (%)
88
91.7%
$7
 
7.3%
¤1
 
1.0%
Other Symbol
ValueCountFrequency (%)
48
57.8%
¦34
41.0%
©1
 
1.2%
Math Symbol
ValueCountFrequency (%)
=3
42.9%
|2
28.6%
+2
28.6%
Space Separator
ValueCountFrequency (%)
57392
> 99.9%
 1
 
< 0.1%
Close Punctuation
ValueCountFrequency (%)
)13
86.7%
]2
 
13.3%
Other Number
ValueCountFrequency (%)
½2
66.7%
²1
33.3%
Dash Punctuation
ValueCountFrequency (%)
-364
100.0%
Modifier Letter
ValueCountFrequency (%)
ˆ3
100.0%
Initial Punctuation
ValueCountFrequency (%)
2
100.0%
Connector Punctuation
ValueCountFrequency (%)
_1
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin279750
78.7%
Common75501
 
21.3%

Most frequent character per script

Latin
ValueCountFrequency (%)
e36795
13.2%
t22060
 
7.9%
o21674
 
7.7%
a18810
 
6.7%
n17991
 
6.4%
i17300
 
6.2%
r16801
 
6.0%
s16059
 
5.7%
h14126
 
5.0%
l11024
 
3.9%
Other values (61)87110
31.1%
Common
Value                 Count   Frequency (%)
(space)               57392   76.0%
.                     10962   14.5%
'                     2278    3.0%
,                     1587    2.1%
!                     1041    1.4%
?                     481     0.6%
-                     364     0.5%
0                     266     0.4%
1                     183     0.2%
2                     109     0.1%
Other values (45)     838     1.1%

Most occurring blocks

ValueCountFrequency (%)
ASCII354937
99.9%
None166
 
< 0.1%
Currency Symbols88
 
< 0.1%
Letterlike Symbols48
 
< 0.1%
Punctuation8
 
< 0.1%
Modifier Letters4
 
< 0.1%

Most frequent character per block

ASCII
ValueCountFrequency (%)
57392
16.2%
e36795
 
10.4%
t22060
 
6.2%
o21674
 
6.1%
a18810
 
5.3%
n17991
 
5.1%
i17300
 
4.9%
r16801
 
4.7%
s16059
 
4.5%
h14126
 
4.0%
Other values (78)115929
32.7%
Currency Symbols
ValueCountFrequency (%)
88
100.0%
None
ValueCountFrequency (%)
â86
51.8%
¦34
 
20.5%
Â5
 
3.0%
ã4
 
2.4%
å3
 
1.8%
ƒ3
 
1.8%
œ3
 
1.8%
·2
 
1.2%
Ë2
 
1.2%
É2
 
1.2%
Other values (18)22
 
13.3%
Letterlike Symbols
ValueCountFrequency (%)
48
100.0%
Modifier Letters
ValueCountFrequency (%)
ˆ3
75.0%
˜1
 
25.0%
Punctuation
ValueCountFrequency (%)
2
25.0%
2
25.0%
1
12.5%
1
12.5%
1
12.5%
1
12.5%

keywords
Categorical

HIGH CARDINALITY
MISSING

Distinct: 8804
Distinct (%): 93.9%
Missing: 1493
Missing (%): 13.7%
Memory size: 85.0 KiB

Value                   Count
woman director          134
independent film        82
sport                   25
duringcreditsstinger    24
suspense                24
Other values (8799)     9084

Length

Max length: 131
Median length: 88
Mean length: 41.97247413
Min length: 2

Characters and Unicode

Total characters: 393408
Distinct characters: 87
Distinct categories: 17
Distinct scripts: 2
Distinct blocks: 5
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 8673
Unique (%): 92.5%

Sample

1st row: monster|dna|tyrannosaurus rex|velociraptor|island
2nd row: future|chase|post-apocalyptic|dystopia|australia
3rd row: based on novel|revolution|dystopia|sequel|dystopic future
4th row: android|spaceship|jedi|space opera|3d
5th row: car race|speed|revenge|suspense|car

Common Values

Value                              Count   Frequency (%)
woman director                     134     1.2%
independent film                   82      0.8%
sport                              25      0.2%
duringcreditsstinger               24      0.2%
suspense                           24      0.2%
musical                            24      0.2%
holiday                            16      0.1%
stand-up|stand up comedy           16      0.1%
biography                          15      0.1%
independent film|woman director    13      0.1%
Other values (8794)                9000    82.8%
(Missing)                          1493    13.7%

Length

[Figure: Histogram of lengths of the category (keywords)]
ValueCountFrequency (%)
of599
 
2.2%
on546
 
2.0%
director357
 
1.3%
film231
 
0.9%
and231
 
0.9%
new203
 
0.8%
in190
 
0.7%
the177
 
0.7%
brother164
 
0.6%
based163
 
0.6%
Other values (16457)24006
89.4%

Most occurring characters

ValueCountFrequency (%)
e37321
 
9.5%
i29650
 
7.5%
a29180
 
7.4%
|28077
 
7.1%
r27999
 
7.1%
o25158
 
6.4%
n24736
 
6.3%
t23340
 
5.9%
s22514
 
5.7%
17513
 
4.5%
Other values (77)127920
32.5%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter346243
88.0%
Math Symbol28078
 
7.1%
Space Separator17536
 
4.5%
Dash Punctuation570
 
0.1%
Other Punctuation423
 
0.1%
Decimal Number366
 
0.1%
Uppercase Letter73
 
< 0.1%
Open Punctuation30
 
< 0.1%
Close Punctuation28
 
< 0.1%
Other Symbol21
 
< 0.1%
Other values (7)40
 
< 0.1%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
e37321
 
10.8%
i29650
 
8.6%
a29180
 
8.4%
r27999
 
8.1%
o25158
 
7.3%
n24736
 
7.1%
t23340
 
6.7%
s22514
 
6.5%
l17116
 
4.9%
c13962
 
4.0%
Other values (27)95267
27.5%
Other Punctuation
ValueCountFrequency (%)
.246
58.2%
'156
36.9%
7
 
1.7%
§3
 
0.7%
3
 
0.7%
¡2
 
0.5%
&2
 
0.5%
1
 
0.2%
1
 
0.2%
,1
 
0.2%
Decimal Number
Value   Count   Frequency (%)
1       82      22.4%
0       70      19.1%
9       68      18.6%
7       59      16.1%
3       41      11.2%
2       14      3.8%
5       12      3.3%
6       12      3.3%
4       4       1.1%
8       4       1.1%
Uppercase Letter
ValueCountFrequency (%)
Ã42
57.5%
Â23
31.5%
Ÿ5
 
6.8%
Î2
 
2.7%
Š1
 
1.4%
Other Symbol
ValueCountFrequency (%)
©17
81.0%
¦3
 
14.3%
°1
 
4.8%
Currency Symbol
ValueCountFrequency (%)
¤6
46.2%
5
38.5%
¥2
 
15.4%
Modifier Symbol
ValueCountFrequency (%)
´2
33.3%
¸2
33.3%
˜2
33.3%
Math Symbol
ValueCountFrequency (%)
|28077
> 99.9%
¬1
 
< 0.1%
Space Separator
ValueCountFrequency (%)
17513
99.9%
 23
 
0.1%
Open Punctuation
ValueCountFrequency (%)
(28
93.3%
2
 
6.7%
Initial Punctuation
ValueCountFrequency (%)
6
85.7%
«1
 
14.3%
Other Number
ValueCountFrequency (%)
¼3
75.0%
³1
 
25.0%
Dash Punctuation
ValueCountFrequency (%)
-570
100.0%
Close Punctuation
ValueCountFrequency (%)
)28
100.0%
Modifier Letter
ValueCountFrequency (%)
ˆ4
100.0%
Other Letter
ValueCountFrequency (%)
º3
100.0%
Final Punctuation
ValueCountFrequency (%)
»3
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin346319
88.0%
Common47089
 
12.0%

Most frequent character per script

Common
Value                 Count   Frequency (%)
|                     28077   59.6%
(space)               17513   37.2%
-                     570     1.2%
.                     246     0.5%
'                     156     0.3%
1                     82      0.2%
0                     70      0.1%
9                     68      0.1%
7                     59      0.1%
3                     41      0.1%
Other values (34)     207     0.4%
Latin
ValueCountFrequency (%)
e37321
 
10.8%
i29650
 
8.6%
a29180
 
8.4%
r27999
 
8.1%
o25158
 
7.3%
n24736
 
7.1%
t23340
 
6.7%
s22514
 
6.5%
l17116
 
4.9%
c13962
 
4.0%
Other values (33)95343
27.5%

Most occurring blocks

ValueCountFrequency (%)
ASCII393202
99.9%
None182
 
< 0.1%
Punctuation13
 
< 0.1%
Modifier Letters6
 
< 0.1%
Currency Symbols5
 
< 0.1%

Most frequent character per block

ASCII
ValueCountFrequency (%)
e37321
 
9.5%
i29650
 
7.5%
a29180
 
7.4%
|28077
 
7.1%
r27999
 
7.1%
o25158
 
6.4%
n24736
 
6.3%
t23340
 
5.9%
s22514
 
5.7%
17513
 
4.5%
Other values (35)127714
32.5%
None
ValueCountFrequency (%)
Ã42
23.1%
 23
12.6%
Â23
12.6%
©17
 
9.3%
å8
 
4.4%
7
 
3.8%
¤6
 
3.3%
â5
 
2.7%
Ÿ5
 
2.7%
§3
 
1.6%
Other values (24)43
23.6%
Punctuation
ValueCountFrequency (%)
6
46.2%
3
23.1%
2
 
15.4%
1
 
7.7%
1
 
7.7%
Currency Symbols
ValueCountFrequency (%)
5
100.0%
Modifier Letters
ValueCountFrequency (%)
ˆ4
66.7%
˜2
33.3%

overview
Categorical

HIGH CARDINALITY
UNIFORM

Distinct: 10847
Distinct (%): 99.9%
Missing: 4
Missing (%): < 0.1%
Memory size: 85.0 KiB
No overview found.
 
13
In the year of 2039, after World Wars destroy much of the civilization as we know it, territories are no longer run by governments, but by corporations; the mightiest of which is the Mishima Zaibatsu. In order to placate the seething masses of this dystopia, Mishima sponsors Tekken, a tournament in which fighters battle until only one is left standing.
 
2
Wilbur the pig is scared of the end of the season, because he knows that come that time, he will end up on the dinner table. He hatches a plan with Charlotte, a spider that lives in his pen, to ensure that this will never happen.
 
2
1960. The thrilling battles waged by a band of kids from two rival villages in the southern French countryside.
 
2
The year is 1965 and Danny Embling, is an awkward, underdeveloped teen suffering from occasional bouts of stuttering, attends an all-male boarding school in New South Wales, Australia. it has been some time since Danny has had any romantic relationship with a girl. He slowly becomes interested in Thandiwe Adjewa, a Ugandan-Kenyan-British girl attending the all-girls school across the lake.
 
1
Other values (10842)
10842 

Length

Max length: 1000
Median length: 738
Mean length: 307.0375621
Min length: 13

Characters and Unicode

Total characters: 3335042
Distinct characters: 144
Distinct categories: 20
Distinct scripts: 2
Distinct blocks: 6
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 10843
Unique (%): 99.8%

Sample

1st row: Twenty-two years after the events of Jurassic Park, Isla Nublar now features a fully functioning dinosaur theme park, Jurassic World, as originally envisioned by John Hammond.
2nd row: An apocalyptic story set in the furthest reaches of our planet, in a stark desert landscape where humanity is broken, and most everyone is crazed fighting for the necessities of life. Within this world exist two rebels on the run who just might be able to restore order. There's Max, a man of action and a man of few words, who seeks peace of mind following the loss of his wife and child in the aftermath of the chaos. And Furiosa, a woman of action and a woman who believes her path to survival may be achieved if she can make it across the desert back to her childhood homeland.
3rd row: Beatrice Prior must confront her inner demons and continue her fight against a powerful alliance which threatens to tear her society apart.
4th row: Thirty years after defeating the Galactic Empire, Han Solo and his allies face a new threat from the evil Kylo Ren and his army of Stormtroopers.
5th row: Deckard Shaw seeks revenge against Dominic Toretto and his family for his comatose brother.

Common Values

ValueCountFrequency (%)
No overview found.13
 
0.1%
In the year of 2039, after World Wars destroy much of the civilization as we know it, territories are no longer run by governments, but by corporations; the mightiest of which is the Mishima Zaibatsu. In order to placate the seething masses of this dystopia, Mishima sponsors Tekken, a tournament in which fighters battle until only one is left standing.2
 
< 0.1%
Wilbur the pig is scared of the end of the season, because he knows that come that time, he will end up on the dinner table. He hatches a plan with Charlotte, a spider that lives in his pen, to ensure that this will never happen.2
 
< 0.1%
1960. The thrilling battles waged by a band of kids from two rival villages in the southern French countryside.2
 
< 0.1%
The year is 1965 and Danny Embling, is an awkward, underdeveloped teen suffering from occasional bouts of stuttering, attends an all-male boarding school in New South Wales, Australia. it has been some time since Danny has had any romantic relationship with a girl. He slowly becomes interested in Thandiwe Adjewa, a Ugandan-Kenyan-British girl attending the all-girls school across the lake.1
 
< 0.1%
Deck the halls with Donkey’s laughter in this all-new holiday collection. Donkey presents his very own carolling stage show featuring his Far Far Away pals in this merry, musical treat with all the trimmings! Join in the fun as they bring their own Shrektacular spirit to festive holiday songs, a fun Donkey Decoration scramble, and a hilarious virtual Yule Log that’s so funny… it’s on fire.1
 
< 0.1%
Eager to provide a better future for her son, Fadi (Melkar Muallem), divorcée Muna Farah (Nisreen Faour) leaves her Palestinian homeland and takes up residence in rural Illinois -- just in time to encounter the domestic repercussions of America's disastrous war in Iraq. Now, the duo must reinvent their lives with some help from Muna's sister, Raghda (Hiam Abbass), and brother-in-law, Nabeel (Yussuf Abu-Warda). Cherien Dabis writes and directs.1
 
< 0.1%
King Randolph sends for his cousin Duchess Rowena to help turn his daughters, Princess Genevieve and her 11 sisters, into better ladies. But the Duchess takes away all the sisters fun, including the sisters favorite pastime: dancing.Thinking all hope is lost they find a secret passageway to a magical land were they can dance the night away.1
 
< 0.1%
Quantum of Solace continues the adventures of James Bond after Casino Royale. Betrayed by Vesper, the woman he loved, 007 fights the urge to make his latest mission personal. Pursuing his determination to uncover the truth, Bond and M interrogate Mr. White, who reveals that the organization that blackmailed Vesper is far more complex and dangerous than anyone had imagined.1
 
< 0.1%
A reporter becomes the target of a vicious smear campaign that drives him to the point of suicide after he exposes the CIA's role in arming Contra rebels in Nicaragua and importing cocaine into California. Based on the true story of journalist Gary Webb.1
 
< 0.1%
Other values (10837)10837
99.7%
(Missing)4
 
< 0.1%

Length

[Figure: Histogram of lengths of the category (overview)]
ValueCountFrequency (%)
the31655
 
5.6%
a23240
 
4.1%
to17549
 
3.1%
and16956
 
3.0%
of15597
 
2.7%
in10160
 
1.8%
his8607
 
1.5%
is7865
 
1.4%
with5514
 
1.0%
her4793
 
0.8%
Other values (38551)425509
75.0%

Most occurring characters

ValueCountFrequency (%)
556977
16.7%
e320395
 
9.6%
t219691
 
6.6%
a215612
 
6.5%
i194998
 
5.8%
o191213
 
5.7%
n191104
 
5.7%
s177760
 
5.3%
r174774
 
5.2%
h139782
 
4.2%
Other values (134)952736
28.6%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter2591110
77.7%
Space Separator557010
 
16.7%
Uppercase Letter90606
 
2.7%
Other Punctuation70522
 
2.1%
Dash Punctuation9126
 
0.3%
Decimal Number8887
 
0.3%
Open Punctuation1981
 
0.1%
Close Punctuation1977
 
0.1%
Currency Symbol1893
 
0.1%
Other Symbol1256
 
< 0.1%
Other values (10)674
 
< 0.1%

Most frequent character per category

Uppercase Letter
ValueCountFrequency (%)
A10241
 
11.3%
T7658
 
8.5%
S6972
 
7.7%
B6009
 
6.6%
C5663
 
6.3%
M5455
 
6.0%
W4604
 
5.1%
H4191
 
4.6%
D4027
 
4.4%
J3560
 
3.9%
Other values (23)32226
35.6%
Lowercase Letter
ValueCountFrequency (%)
e320395
12.4%
t219691
 
8.5%
a215612
 
8.3%
i194998
 
7.5%
o191213
 
7.4%
n191104
 
7.4%
s177760
 
6.9%
r174774
 
6.7%
h139782
 
5.4%
l111503
 
4.3%
Other values (21)654278
25.3%
Other Punctuation
ValueCountFrequency (%)
,30277
42.9%
.27504
39.0%
'7736
 
11.0%
"2437
 
3.5%
:793
 
1.1%
?575
 
0.8%
;471
 
0.7%
!416
 
0.6%
/151
 
0.2%
&89
 
0.1%
Other values (11)73
 
0.1%
Decimal Number
Value   Count   Frequency (%)
1       2014    22.7%
0       1884    21.2%
9       1229    13.8%
2       987     11.1%
5       508     5.7%
3       489     5.5%
8       477     5.4%
7       463     5.2%
4       430     4.8%
6       406     4.6%
Currency Symbol
ValueCountFrequency (%)
1756
92.8%
$99
 
5.2%
¢15
 
0.8%
£12
 
0.6%
¤7
 
0.4%
¥4
 
0.2%
Modifier Symbol
ValueCountFrequency (%)
˜48
47.1%
¨30
29.4%
¯12
 
11.8%
´8
 
7.8%
`3
 
2.9%
¸1
 
1.0%
Other Number
ValueCountFrequency (%)
³11
34.4%
¼10
31.2%
¹4
 
12.5%
²3
 
9.4%
¾2
 
6.2%
½2
 
6.2%
Other Symbol
ValueCountFrequency (%)
883
70.3%
©283
 
22.5%
¦71
 
5.7%
®18
 
1.4%
°1
 
0.1%
Final Punctuation
ValueCountFrequency (%)
174
95.1%
»5
 
2.7%
3
 
1.6%
1
 
0.5%
Math Symbol
ValueCountFrequency (%)
±4
40.0%
¬3
30.0%
=2
20.0%
|1
 
10.0%
Open Punctuation
ValueCountFrequency (%)
(1967
99.3%
[10
 
0.5%
4
 
0.2%
Initial Punctuation
ValueCountFrequency (%)
295
96.7%
«7
 
2.3%
3
 
1.0%
Space Separator
ValueCountFrequency (%)
556977
> 99.9%
 33
 
< 0.1%
Dash Punctuation
ValueCountFrequency (%)
-9124
> 99.9%
2
 
< 0.1%
Close Punctuation
ValueCountFrequency (%)
)1967
99.5%
]10
 
0.5%
Other Letter
ValueCountFrequency (%)
ª10
76.9%
º3
 
23.1%
Control
ValueCountFrequency (%)
13
100.0%
Format
ValueCountFrequency (%)
­13
100.0%
Connector Punctuation
ValueCountFrequency (%)
_2
100.0%
Modifier Letter
ValueCountFrequency (%)
ˆ1
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin2681728
80.4%
Common653314
 
19.6%

Most frequent character per script

Common
Value                 Count    Frequency (%)
(space)               556977   85.3%
,                     30277    4.6%
.                     27504    4.2%
-                     9124     1.4%
'                     7736     1.2%
"                     2437     0.4%
1                     2014     0.3%
(                     1967     0.3%
)                     1967     0.3%
0                     1884     0.3%
Other values (69)     11427    1.7%
Latin
ValueCountFrequency (%)
e320395
11.9%
t219691
 
8.2%
a215612
 
8.0%
i194998
 
7.3%
o191213
 
7.1%
n191104
 
7.1%
s177760
 
6.6%
r174774
 
6.5%
h139782
 
5.2%
l111503
 
4.2%
Other values (55)744896
27.8%

Most occurring blocks

ValueCountFrequency (%)
ASCII3328794
99.8%
None3070
 
0.1%
Currency Symbols1756
 
0.1%
Letterlike Symbols883
 
< 0.1%
Punctuation490
 
< 0.1%
Modifier Letters49
 
< 0.1%

Most frequent character per block

ASCII
ValueCountFrequency (%)
556977
16.7%
e320395
 
9.6%
t219691
 
6.6%
a215612
 
6.5%
i194998
 
5.9%
o191213
 
5.7%
n191104
 
5.7%
s177760
 
5.3%
r174774
 
5.3%
h139782
 
4.2%
Other values (77)946488
28.4%
None
ValueCountFrequency (%)
â1762
57.4%
Ã489
 
15.9%
©283
 
9.2%
œ137
 
4.5%
¦71
 
2.3%
Â49
 
1.6%
 33
 
1.1%
¨30
 
1.0%
¡23
 
0.7%
®18
 
0.6%
Other values (32)175
 
5.7%
Currency Symbols
ValueCountFrequency (%)
1756
100.0%
Letterlike Symbols
ValueCountFrequency (%)
883
100.0%
Punctuation
ValueCountFrequency (%)
295
60.2%
174
35.5%
4
 
0.8%
3
 
0.6%
3
 
0.6%
3
 
0.6%
2
 
0.4%
2
 
0.4%
2
 
0.4%
1
 
0.2%
Modifier Letters
ValueCountFrequency (%)
˜48
98.0%
ˆ1
 
2.0%
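
The characters reported in the "None" block above (â, Ã, ©, and similar) are the typical signature of UTF-8 text that was decoded as Windows-1252 or Latin-1 somewhere upstream. That is an inference from the character tables, not something the report states. Under that assumption, a value can often be repaired by reversing the wrong decode. The sketch below is illustrative only; `fix_mojibake` is a hypothetical helper, and strings that mix correct and damaged characters are returned unchanged.

```python
def fix_mojibake(text: str) -> str:
    """Try to undo a UTF-8-read-as-single-byte round trip.

    Returns the input unchanged when no candidate codec round-trips cleanly.
    """
    for codec in ("cp1252", "latin-1"):
        try:
            # Recover the original bytes, then decode them as UTF-8.
            return text.encode(codec).decode("utf-8")
        except (UnicodeEncodeError, UnicodeDecodeError):
            continue
    return text

repaired = fix_mojibake("divorcÃ©e")
```

Checking the repaired column again with the same character tables would show whether the "None" block shrinks as expected.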

runtime
Real number (ℝ≥0)

Distinct: 247
Distinct (%): 2.3%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 102.0708632
Minimum: 0
Maximum: 900
Zeros: 31
Zeros (%): 0.3%
Negative: 0
Negative (%): 0.0%
Memory size: 85.0 KiB

Quantile statistics

Minimum: 0
5-th percentile: 75
Q1: 90
Median: 99
Q3: 111
95-th percentile: 139
Maximum: 900
Range: 900
Interquartile range (IQR): 21

Descriptive statistics

Standard deviation: 31.38140508
Coefficient of variation (CV): 0.307447239
Kurtosis: 116.2375673
Mean: 102.0708632
Median Absolute Deviation (MAD): 10
Skewness: 6.103792811
Sum: 1109102
Variance: 984.7925849
Monotonicity: Not monotonic
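
Several of the figures above are simple functions of one another: the coefficient of variation is the standard deviation divided by the mean (31.38140508 / 102.0708632 ≈ 0.3074), the interquartile range is Q3 - Q1 (111 - 90 = 21), and the range is maximum minus minimum. The sketch below recomputes such quantities with Python's standard `statistics` module. It assumes a sample standard deviation and linearly interpolated quartiles, which may differ from what the report generator used; `describe` is a hypothetical helper.

```python
import statistics

def describe(values):
    """Summary statistics in the spirit of the tables above."""
    data = sorted(values)
    mean = statistics.mean(data)
    std = statistics.stdev(data)  # sample standard deviation (n - 1)
    # Quartiles with linear interpolation between data points.
    q1, median, q3 = statistics.quantiles(data, n=4, method="inclusive")
    # Median absolute deviation around the median.
    mad = statistics.median(abs(x - median) for x in data)
    return {
        "mean": mean,
        "std": std,
        "cv": std / mean,
        "iqr": q3 - q1,
        "range": data[-1] - data[0],
        "mad": mad,
        "zeros": sum(1 for x in data if x == 0),
    }

stats = describe([0, 90, 99, 111, 900])
```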
[Figure: Histogram with fixed size bins (bins=50)]
Value                  Count   Frequency (%)
90                     547     5.0%
95                     358     3.3%
100                    335     3.1%
93                     328     3.0%
97                     306     2.8%
96                     300     2.8%
91                     297     2.7%
94                     292     2.7%
92                     270     2.5%
98                     270     2.5%
Other values (237)     7563    69.6%

Value   Count   Frequency (%)
0       31      0.3%
2       5       < 0.1%
3       11      0.1%
4       17      0.2%
5       17      0.2%
6       22      0.2%
7       17      0.2%
8       9       0.1%
9       7       0.1%
10      6       0.1%

Value   Count   Frequency (%)
900     1       < 0.1%
877     1       < 0.1%
705     1       < 0.1%
566     1       < 0.1%
561     1       < 0.1%
550     1       < 0.1%
540     1       < 0.1%
501     1       < 0.1%
500     1       < 0.1%
470     1       < 0.1%

genres
Categorical

HIGH CARDINALITY

Distinct: 2039
Distinct (%): 18.8%
Missing: 23
Missing (%): 0.2%
Memory size: 85.0 KiB

Value                  Count
Drama                  712
Comedy                 712
Documentary            312
Drama|Romance          289
Comedy|Drama           280
Other values (2034)    8538

Length

Max length: 51
Median length: 44
Mean length: 18.53712072
Min length: 3

Characters and Unicode

Total characters: 200998
Distinct characters: 30
Distinct categories: 4
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 1225
Unique (%): 11.3%

Sample

1st row: Action|Adventure|Science Fiction|Thriller
2nd row: Action|Adventure|Science Fiction|Thriller
3rd row: Adventure|Science Fiction|Thriller
4th row: Action|Adventure|Science Fiction|Fantasy
5th row: Action|Crime|Thriller
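
The sample rows show that each genres value packs several labels into one pipe-delimited string, so the report's counts describe whole combinations (for example "Drama|Romance") rather than individual genres. A minimal sketch of counting individual labels instead, assuming `|` is always the separator; `count_labels` is a hypothetical helper.

```python
from collections import Counter

def count_labels(values, sep="|"):
    """Count individual labels across delimiter-joined strings, skipping missing values."""
    counts = Counter()
    for value in values:
        if value:  # ignore None and empty strings
            counts.update(part.strip() for part in value.split(sep))
    return counts

rows = [
    "Action|Adventure|Science Fiction|Thriller",
    "Action|Adventure|Science Fiction|Thriller",
    "Adventure|Science Fiction|Thriller",
    "Action|Adventure|Science Fiction|Fantasy",
    "Action|Crime|Thriller",
    None,
]
label_counts = count_labels(rows)
```

The same approach would apply to the other pipe-delimited columns in this dataset, such as keywords and production_companies.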

Common Values

Value                   Count   Frequency (%)
Drama                   712     6.6%
Comedy                  712     6.6%
Documentary             312     2.9%
Drama|Romance           289     2.7%
Comedy|Drama            280     2.6%
Comedy|Romance          268     2.5%
Horror|Thriller         259     2.4%
Horror                  253     2.3%
Comedy|Drama|Romance    222     2.0%
Drama|Thriller          138     1.3%
Other values (2029)     7398    68.1%

Length

[Figure: Histogram of lengths of the category (genres)]
ValueCountFrequency (%)
drama712
 
5.8%
comedy712
 
5.8%
fiction670
 
5.5%
documentary312
 
2.5%
drama|romance289
 
2.4%
comedy|drama280
 
2.3%
comedy|romance268
 
2.2%
horror|thriller259
 
2.1%
horror253
 
2.1%
comedy|drama|romance222
 
1.8%
Other values (1900)8263
67.5%

Most occurring characters

ValueCountFrequency (%)
r20601
 
10.2%
e17185
 
8.5%
|16117
 
8.0%
a15786
 
7.9%
o14302
 
7.1%
m14071
 
7.0%
i14064
 
7.0%
n11215
 
5.6%
c8715
 
4.3%
t8530
 
4.2%
Other values (20)60412
30.1%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter154960
77.1%
Uppercase Letter28524
 
14.2%
Math Symbol16117
 
8.0%
Space Separator1397
 
0.7%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
r20601
13.3%
e17185
11.1%
a15786
10.2%
o14302
9.2%
m14071
9.1%
i14064
9.1%
n11215
7.2%
c8715
 
5.6%
t8530
 
5.5%
y8414
 
5.4%
Other values (7)22077
14.2%
Uppercase Letter
ValueCountFrequency (%)
D5281
18.5%
C5148
18.0%
A4555
16.0%
F3565
12.5%
T3075
10.8%
H1971
 
6.9%
R1712
 
6.0%
M1385
 
4.9%
S1230
 
4.3%
W435
 
1.5%
Math Symbol
ValueCountFrequency (%)
|16117
100.0%
Space Separator
ValueCountFrequency (%)
1397
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin183484
91.3%
Common17514
 
8.7%

Most frequent character per script

Latin
ValueCountFrequency (%)
r20601
11.2%
e17185
 
9.4%
a15786
 
8.6%
o14302
 
7.8%
m14071
 
7.7%
i14064
 
7.7%
n11215
 
6.1%
c8715
 
4.7%
t8530
 
4.6%
y8414
 
4.6%
Other values (18)50601
27.6%
Common
ValueCountFrequency (%)
|16117
92.0%
1397
 
8.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII200998
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
r20601
 
10.2%
e17185
 
8.5%
|16117
 
8.0%
a15786
 
7.9%
o14302
 
7.1%
m14071
 
7.0%
i14064
 
7.0%
n11215
 
5.6%
c8715
 
4.3%
t8530
 
4.2%
Other values (20)60412
30.1%

production_companies
Categorical

HIGH CARDINALITY
MISSING

Distinct: 7445
Distinct (%): 75.7%
Missing: 1030
Missing (%): 9.5%
Memory size: 85.0 KiB

Value                        Count
Paramount Pictures           156
Universal Pictures           133
Warner Bros.                 84
Walt Disney Pictures         76
Metro-Goldwyn-Mayer (MGM)    72
Other values (7440)          9315

Length

Max length: 184
Median length: 128
Mean length: 45.51616511
Min length: 3

Characters and Unicode

Total characters: 447697
Distinct characters: 120
Distinct categories: 17
Distinct scripts: 2
Distinct blocks: 4
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 6850
Unique (%): 69.6%

Sample

1st row: Universal Studios|Amblin Entertainment|Legendary Pictures|Fuji Television Network|Dentsu
2nd row: Village Roadshow Pictures|Kennedy Miller Productions
3rd row: Summit Entertainment|Mandeville Films|Red Wagon Entertainment|NeoReel
4th row: Lucasfilm|Truenorth Productions|Bad Robot
5th row: Universal Pictures|Original Film|Media Rights Capital|Dentsu|One Race Films

Common Values

Value                                     Count   Frequency (%)
Paramount Pictures                        156     1.4%
Universal Pictures                        133     1.2%
Warner Bros.                              84      0.8%
Walt Disney Pictures                      76      0.7%
Metro-Goldwyn-Mayer (MGM)                 72      0.7%
Columbia Pictures                         72      0.7%
New Line Cinema                           61      0.6%
Touchstone Pictures                       51      0.5%
20th Century Fox                          50      0.5%
Twentieth Century Fox Film Corporation    49      0.5%
Other values (7435)                       9032    83.1%
(Missing)                                 1030    9.5%

Length

[Figure: Histogram of lengths of the category (production_companies)]
ValueCountFrequency (%)
pictures1880
 
4.3%
productions1750
 
4.0%
films1571
 
3.6%
entertainment1187
 
2.7%
film1183
 
2.7%
universal512
 
1.2%
fox480
 
1.1%
paramount442
 
1.0%
century421
 
1.0%
columbia412
 
0.9%
Other values (12817)34206
77.7%

Most occurring characters

ValueCountFrequency (%)
i35180
 
7.9%
34207
 
7.6%
e33390
 
7.5%
n31750
 
7.1%
t30796
 
6.9%
r29487
 
6.6%
o26480
 
5.9%
a24331
 
5.4%
s21969
 
4.9%
l15887
 
3.5%
Other values (110)164220
36.7%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter329907
73.7%
Uppercase Letter63468
 
14.2%
Space Separator34213
 
7.6%
Math Symbol13544
 
3.0%
Other Punctuation2192
 
0.5%
Decimal Number1645
 
0.4%
Dash Punctuation849
 
0.2%
Open Punctuation721
 
0.2%
Close Punctuation720
 
0.2%
Other Symbol320
 
0.1%
Other values (7)118
 
< 0.1%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
i35180
10.7%
e33390
10.1%
n31750
9.6%
t30796
9.3%
r29487
8.9%
o26480
8.0%
a24331
 
7.4%
s21969
 
6.7%
l15887
 
4.8%
u15355
 
4.7%
Other values (21)65282
19.8%
Uppercase Letter
ValueCountFrequency (%)
P10513
16.6%
F7597
12.0%
C5813
 
9.2%
M3907
 
6.2%
E3880
 
6.1%
S3731
 
5.9%
B2962
 
4.7%
T2804
 
4.4%
A2700
 
4.3%
G2331
 
3.7%
Other values (20)17230
27.1%
Other Punctuation
ValueCountFrequency (%)
.1411
64.4%
/273
 
12.5%
&214
 
9.8%
,119
 
5.4%
'88
 
4.0%
20
 
0.9%
¡18
 
0.8%
!11
 
0.5%
11
 
0.5%
§7
 
0.3%
Other values (10)20
 
0.9%
Decimal Number
Value   Count   Frequency (%)
2       418     25.4%
0       401     24.4%
1       204     12.4%
4       170     10.3%
3       145     8.8%
9       83      5.0%
6       69      4.2%
7       63      3.8%
8       49      3.0%
5       43      2.6%
Modifier Symbol
ValueCountFrequency (%)
¨8
50.0%
´3
 
18.8%
¯3
 
18.8%
˜1
 
6.2%
¸1
 
6.2%
Math Symbol
ValueCountFrequency (%)
|13391
98.9%
+132
 
1.0%
±21
 
0.2%
Open Punctuation
ValueCountFrequency (%)
(719
99.7%
[1
 
0.1%
1
 
0.1%
Currency Symbol
ValueCountFrequency (%)
¤20
87.0%
¢2
 
8.7%
¥1
 
4.3%
Space Separator
ValueCountFrequency (%)
34207
> 99.9%
 6
 
< 0.1%
Dash Punctuation
ValueCountFrequency (%)
-848
99.9%
1
 
0.1%
Close Punctuation
ValueCountFrequency (%)
)719
99.9%
]1
 
0.1%
Other Symbol
ValueCountFrequency (%)
©314
98.1%
°6
 
1.9%
Other Number
ValueCountFrequency (%)
³48
90.6%
¼5
 
9.4%
Other Letter
ValueCountFrequency (%)
ª3
60.0%
º2
40.0%
Format
ValueCountFrequency (%)
­18
100.0%
Connector Punctuation
ValueCountFrequency (%)
_2
100.0%
Final Punctuation
ValueCountFrequency (%)
1
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin393377
87.9%
Common54320
 
12.1%

Most frequent character per script

Latin
ValueCountFrequency (%)
i35180
 
8.9%
e33390
 
8.5%
n31750
 
8.1%
t30796
 
7.8%
r29487
 
7.5%
o26480
 
6.7%
a24331
 
6.2%
s21969
 
5.6%
l15887
 
4.0%
u15355
 
3.9%
Other values (52)128752
32.7%
Common
Value                 Count   Frequency (%)
(space)               34207   63.0%
|                     13391   24.7%
.                     1411    2.6%
-                     848     1.6%
)                     719     1.3%
(                     719     1.3%
2                     418     0.8%
0                     401     0.7%
©                     314     0.6%
/                     273     0.5%
Other values (48)     1619    3.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII446644
99.8%
None1027
 
0.2%
Punctuation25
 
< 0.1%
Modifier Letters1
 
< 0.1%

Most frequent character per block

ASCII
ValueCountFrequency (%)
i35180
 
7.9%
34207
 
7.7%
e33390
 
7.5%
n31750
 
7.1%
t30796
 
6.9%
r29487
 
6.6%
o26480
 
5.9%
a24331
 
5.4%
s21969
 
4.9%
l15887
 
3.6%
Other values (75)163167
36.5%
None
ValueCountFrequency (%)
Ã514
50.0%
©314
30.6%
³48
 
4.7%
±21
 
2.0%
¤20
 
1.9%
­18
 
1.8%
¡18
 
1.8%
11
 
1.1%
¨8
 
0.8%
§7
 
0.7%
Other values (18)48
 
4.7%
Punctuation
ValueCountFrequency (%)
20
80.0%
1
 
4.0%
1
 
4.0%
1
 
4.0%
1
 
4.0%
1
 
4.0%
Modifier Letters
ValueCountFrequency (%)
˜1
100.0%

release_date
Categorical

HIGH CARDINALITY

Distinct: 5909
Distinct (%): 54.4%
Missing: 0
Missing (%): 0.0%
Memory size: 85.0 KiB

Value                  Count
1/1/09                 28
1/1/08                 21
1/1/07                 18
1/1/05                 16
10/10/14               15
Other values (5904)    10768

Length

Max length: 8
Median length: 7
Mean length: 6.960058899
Min length: 6

Characters and Unicode

Total characters: 75628
Distinct characters: 11
Distinct categories: 2
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 3385
Unique (%): 31.2%

Sample

1st row: 6/9/15
2nd row: 5/13/15
3rd row: 3/18/15
4th row: 12/15/15
5th row: 4/1/15
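
The sample rows show release_date stored as M/D/YY text with a two-digit year, which is ambiguous for older films. Python's `%y` directive maps 69 to 99 onto 1969 to 1999 and 00 to 68 onto 2000 to 2068, so a 1966 release would parse as 2066. The sketch below shifts such dates back a century, assuming no release is later than 2015 (the maximum reported for release_year in this report). `parse_release_date` and its `latest_year` parameter are hypothetical names.

```python
from datetime import date, datetime

def parse_release_date(text: str, latest_year: int = 2015) -> date:
    """Parse an M/D/YY string, moving years after `latest_year` back 100 years."""
    parsed = datetime.strptime(text, "%m/%d/%y").date()
    if parsed.year > latest_year:
        # Two-digit year was expanded into the wrong century.
        parsed = parsed.replace(year=parsed.year - 100)
    return parsed

recent = parse_release_date("6/9/15")
older = parse_release_date("1/1/66")
```

Where the release_year column is available, comparing it with the parsed year is a more direct check than a fixed cutoff.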

Common Values

Value                  Count   Frequency (%)
1/1/09                 28      0.3%
1/1/08                 21      0.2%
1/1/07                 18      0.2%
1/1/05                 16      0.1%
10/10/14               15      0.1%
1/1/06                 13      0.1%
9/7/12                 13      0.1%
1/1/03                 13      0.1%
1/1/12                 12      0.1%
10/16/15               12      0.1%
Other values (5899)    10705   98.5%

Length

[Figure: Histogram of lengths of the category (release_date)]
Value                  Count   Frequency (%)
1/1/09                 28      0.3%
1/1/08                 21      0.2%
1/1/07                 18      0.2%
1/1/05                 16      0.1%
10/10/14               15      0.1%
1/1/06                 13      0.1%
9/7/12                 13      0.1%
1/1/03                 13      0.1%
10/16/15               12      0.1%
10/14/11               12      0.1%
Other values (5899)    10705   98.5%

Most occurring characters

Value   Count   Frequency (%)
/       21732   28.7%
1       14800   19.6%
2       7072    9.4%
0       6710    8.9%
9       5065    6.7%
8       3932    5.2%
3       3528    4.7%
5       3295    4.4%
7       3259    4.3%
4       3219    4.3%

Most occurring categories

ValueCountFrequency (%)
Decimal Number53896
71.3%
Other Punctuation21732
28.7%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
114800
27.5%
27072
13.1%
06710
12.4%
95065
 
9.4%
83932
 
7.3%
33528
 
6.5%
53295
 
6.1%
73259
 
6.0%
43219
 
6.0%
63016
 
5.6%
Other Punctuation
ValueCountFrequency (%)
/21732
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common75628
100.0%

Most frequent character per script

Common
Value   Count   Frequency (%)
/       21732   28.7%
1       14800   19.6%
2       7072    9.4%
0       6710    8.9%
9       5065    6.7%
8       3932    5.2%
3       3528    4.7%
5       3295    4.4%
7       3259    4.3%
4       3219    4.3%
4.3%

Most occurring blocks

ValueCountFrequency (%)
ASCII75628
100.0%

Most frequent character per block

ASCII
Value   Count   Frequency (%)
/       21732   28.7%
1       14800   19.6%
2       7072    9.4%
0       6710    8.9%
9       5065    6.7%
8       3932    5.2%
3       3528    4.7%
5       3295    4.4%
7       3259    4.3%
4       3219    4.3%

vote_count
Real number (ℝ≥0)

HIGH CORRELATION

Distinct: 1289
Distinct (%): 11.9%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 217.3897478
Minimum: 10
Maximum: 9767
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 85.0 KiB

Quantile statistics

Minimum: 10
5-th percentile: 11
Q1: 17
Median: 38
Q3: 145.75
95-th percentile: 1025.75
Maximum: 9767
Range: 9757
Interquartile range (IQR): 128.75

Descriptive statistics

Standard deviation: 575.6190577
Coefficient of variation (CV): 2.647866624
Kurtosis: 53.36097878
Mean: 217.3897478
Median Absolute Deviation (MAD): 26
Skewness: 6.177305758
Sum: 2362157
Variance: 331337.2996
Monotonicity: Not monotonic
[Figure: Histogram with fixed size bins (bins=50)]
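
A histogram with fixed size bins divides the span from minimum to maximum into equal-width intervals and counts the observations in each. The sketch below shows one plain-Python way to do that; it is an illustration, not the plotting code behind the figure, and `fixed_width_histogram` is a hypothetical helper.

```python
def fixed_width_histogram(values, bins=50):
    """Count values in `bins` equal-width intervals between min and max."""
    lo, hi = min(values), max(values)
    width = (hi - lo) / bins
    counts = [0] * bins
    for v in values:
        # The maximum value belongs to the last bin, not a bin past the end.
        idx = min(int((v - lo) / width), bins - 1) if width else 0
        counts[idx] += 1
    return counts

counts = fixed_width_histogram([10, 11, 38, 146, 9767])
```

For vote_count the bin width is (9767 - 10) / 50 ≈ 195, so the first bin already extends past Q3 (145.75) and holds at least three quarters of the observations. A logarithmic axis or quantile-based bins would spread such a skewed variable more evenly.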
Value                   Count   Frequency (%)
10                      501     4.6%
11                      474     4.4%
12                      422     3.9%
13                      377     3.5%
14                      323     3.0%
15                      300     2.8%
16                      270     2.5%
17                      256     2.4%
18                      218     2.0%
19                      189     1.7%
Other values (1279)     7536    69.4%

Value   Count   Frequency (%)
10      501     4.6%
11      474     4.4%
12      422     3.9%
13      377     3.5%
14      323     3.0%
15      300     2.8%
16      270     2.5%
17      256     2.4%
18      218     2.0%
19      189     1.7%

Value   Count   Frequency (%)
9767    1       < 0.1%
8903    1       < 0.1%
8458    1       < 0.1%
8432    1       < 0.1%
7375    1       < 0.1%
7080    1       < 0.1%
6882    1       < 0.1%
6723    1       < 0.1%
6498    1       < 0.1%
6417    1       < 0.1%

vote_average
Real number (ℝ≥0)

Distinct: 72
Distinct (%): 0.7%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 5.974921774
Minimum: 1.5
Maximum: 9.2
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 85.0 KiB

Quantile statistics

Minimum: 1.5
5-th percentile: 4.4
Q1: 5.4
Median: 6
Q3: 6.6
95-th percentile: 7.4
Maximum: 9.2
Range: 7.7
Interquartile range (IQR): 1.2

Descriptive statistics

Standard deviation: 0.9351418153
Coefficient of variation (CV): 0.1565111395
Kurtosis: 0.5435032543
Mean: 5.974921774
Median Absolute Deviation (MAD): 0.6
Skewness: -0.4359079754
Sum: 64923.5
Variance: 0.8744902148
Monotonicity: Not monotonic
[Figure: Histogram with fixed size bins (bins=50)]
ValueCountFrequency (%)
6.1496
 
4.6%
6495
 
4.6%
5.8486
 
4.5%
5.9473
 
4.4%
6.2464
 
4.3%
6.3461
 
4.2%
6.5457
 
4.2%
6.4446
 
4.1%
5.7415
 
3.8%
6.6413
 
3.8%
Other values (62)6260
57.6%
ValueCountFrequency (%)
1.52
 
< 0.1%
21
 
< 0.1%
2.13
< 0.1%
2.23
< 0.1%
2.32
 
< 0.1%
2.47
0.1%
2.52
 
< 0.1%
2.63
< 0.1%
2.73
< 0.1%
2.87
0.1%
ValueCountFrequency (%)
9.21
 
< 0.1%
8.91
 
< 0.1%
8.82
 
< 0.1%
8.71
 
< 0.1%
8.61
 
< 0.1%
8.56
 
0.1%
8.410
0.1%
8.310
0.1%
8.26
 
0.1%
8.116
0.1%

release_year
Real number (ℝ≥0)

HIGH CORRELATION

Distinct      56
Distinct (%)  0.5%
Missing       0
Missing (%)   0.0%
Infinite      0
Infinite (%)  0.0%
Mean          2001.322658
Minimum       1960
Maximum       2015
Zeros         0
Zeros (%)     0.0%
Negative      0
Negative (%)  0.0%
Memory size   85.0 KiB

Quantile statistics

Minimum                    1960
5-th percentile            1973
Q1                         1995
median                     2006
Q3                         2011
95-th percentile           2015
Maximum                    2015
Range                      55
Interquartile range (IQR)  16

Descriptive statistics

Standard deviation               12.81294057
Coefficient of variation (CV)    0.006402236302
Kurtosis                         0.8000513183
Mean                             2001.322658
Median Absolute Deviation (MAD)  7
Skewness                         -1.204254294
Sum                              21746372
Variance                         164.1714461
Monotonicity                     Not monotonic

Histogram with fixed size bins (bins=50)

Value              Count  Frequency (%)
2014               700    6.4%
2013               659    6.1%
2015               629    5.8%
2012               588    5.4%
2011               540    5.0%
2009               533    4.9%
2008               496    4.6%
2010               490    4.5%
2007               438    4.0%
2006               408    3.8%
Other values (46)  5385   49.6%

Value  Count  Frequency (%)
1960   32     0.3%
1961   31     0.3%
1962   32     0.3%
1963   34     0.3%
1964   42     0.4%
1965   35     0.3%
1966   46     0.4%
1967   40     0.4%
1968   39     0.4%
1969   31     0.3%

Value  Count  Frequency (%)
2015   629    5.8%
2014   700    6.4%
2013   659    6.1%
2012   588    5.4%
2011   540    5.0%
2010   490    4.5%
2009   533    4.9%
2008   496    4.6%
2007   438    4.0%
2006   408    3.8%

budget_adj
Real number (ℝ≥0)

HIGH CORRELATION
ZEROS

Distinct      2614
Distinct (%)  24.1%
Missing       0
Missing (%)   0.0%
Infinite      0
Infinite (%)  0.0%
Mean          17551039.82
Minimum       0
Maximum       425000000
Zeros         5696
Zeros (%)     52.4%
Negative      0
Negative (%)  0.0%
Memory size   85.0 KiB

Quantile statistics

Minimum                    0
5-th percentile            0
Q1                         0
median                     0
Q3                         20853251.08
95-th percentile           89375137.94
Maximum                    425000000
Range                      425000000
Interquartile range (IQR)  20853251.08

Descriptive statistics

Standard deviation               34306155.72
Coefficient of variation (CV)    1.954650896
Kurtosis                         13.03695179
Mean                             17551039.82
Median Absolute Deviation (MAD)  0
Skewness                         3.114919907
Sum                              1.907095987 × 10^11
Variance                         1.17691232 × 10^15
Monotonicity                     Not monotonic

Histogram with fixed size bins (bins=50)

Value                Count  Frequency (%)
0                    5696   52.4%
10164004.34          17     0.2%
21033371.65          17     0.2%
20000000             16     0.1%
4605455.254          15     0.1%
33496898.69          14     0.1%
24234951.06          14     0.1%
20328008.68          13     0.1%
40656017.36          13     0.1%
26291714.57          13     0.1%
Other values (2604)  5038   46.4%

Value         Count  Frequency (%)
0             5696   52.4%
0.9210910508  1      < 0.1%
0.9693980426  1      < 0.1%
1.012786634   1      < 0.1%
1.309052847   1      < 0.1%
2.908194128   1      < 0.1%
3             1      < 0.1%
4.519284805   1      < 0.1%
4.605455254   1      < 0.1%
5.006695621   1      < 0.1%

Value        Count  Frequency (%)
425000000    1      < 0.1%
368371256.2  1      < 0.1%
315500574.8  1      < 0.1%
292050672.7  1      < 0.1%
271692064.2  1      < 0.1%
271330494.3  1      < 0.1%
260000000    1      < 0.1%
257599886.7  1      < 0.1%
254100108.5  1      < 0.1%
250419201.7  1      < 0.1%

revenue_adj
Real number (ℝ≥0)

HIGH CORRELATION
ZEROS

Distinct      4840
Distinct (%)  44.5%
Missing       0
Missing (%)   0.0%
Infinite      0
Infinite (%)  0.0%
Mean          51364363.25
Minimum       0
Maximum       2827123750
Zeros         6016
Zeros (%)     55.4%
Negative      0
Negative (%)  0.0%
Memory size   85.0 KiB

Quantile statistics

Minimum                    0
5-th percentile            0
Q1                         0
median                     0
Q3                         33697095.72
95-th percentile           276554439
Maximum                    2827123750
Range                      2827123750
Interquartile range (IQR)  33697095.72

Descriptive statistics

Standard deviation               144632485
Coefficient of variation (CV)    2.815813842
Kurtosis                         63.37990799
Mean                             51364363.25
Median Absolute Deviation (MAD)  0
Skewness                         6.251202093
Sum                              5.581251711 × 10^11
Variance                         2.091855573 × 10^16
Monotonicity                     Not monotonic

Histogram with fixed size bins (bins=50)

Value                Count  Frequency (%)
0                    6016   55.4%
14389144.83          2      < 0.1%
57667591.03          2      < 0.1%
1000000              2      < 0.1%
209354710.5          2      < 0.1%
29106404.28          2      < 0.1%
81036423.44          2      < 0.1%
967000               2      < 0.1%
89906740.12          2      < 0.1%
31721459             2      < 0.1%
Other values (4830)  4832   44.5%

Value        Count  Frequency (%)
0            6016   55.4%
2.37070529   1      < 0.1%
2.861933734  1      < 0.1%
3.038359901  1      < 0.1%
5.926763224  1      < 0.1%
6.951083695  1      < 0.1%
8.585801203  1      < 0.1%
9.05681977   1      < 0.1%
9.115079704  1      < 0.1%
10           1      < 0.1%

Value       Count  Frequency (%)
2827123750  1      < 0.1%
2789712242  1      < 0.1%
2506405735  1      < 0.1%
2167324901  1      < 0.1%
1907005842  1      < 0.1%
1902723130  1      < 0.1%
1791694309  1      < 0.1%
1583049536  1      < 0.1%
1574814740  1      < 0.1%
1443191435  1      < 0.1%

Interactions


Correlations


Spearman's ρ

Spearman's rank correlation coefficient (ρ) measures monotonic correlation between two variables and is therefore better than Pearson's r at catching nonlinear monotonic relationships. Its value lies between -1 and +1: -1 indicates total negative monotonic correlation, 0 indicates no monotonic correlation, and +1 indicates total positive monotonic correlation.

To calculate ρ for two variables X and Y, one divides the covariance of the rank variables of X and Y by the product of their standard deviations.
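
The calculation described above can be sketched in plain Python. This is a minimal illustration, not the code the report generator runs: the helper names (`average_ranks`, `pearson`, `spearman_rho`) and the toy data are assumptions, and tied values are given the average of their rank positions.

```python
from math import sqrt

def average_ranks(values):
    """Return 1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        # Extend the run while the next sorted value ties with the current one.
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        shared_rank = (start + end) / 2 + 1
        for position in range(start, end + 1):
            ranks[order[position]] = shared_rank
        start = end + 1
    return ranks

def pearson(x, y):
    """Covariance divided by the product of standard deviations.

    The 1/n factors cancel, so unnormalised sums are used throughout.
    """
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    covariance = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    spread_x = sqrt(sum((a - mean_x) ** 2 for a in x))
    spread_y = sqrt(sum((b - mean_y) ** 2 for b in y))
    return covariance / (spread_x * spread_y)

def spearman_rho(x, y):
    """Spearman's rho: Pearson's r applied to the rank variables."""
    return pearson(average_ranks(x), average_ranks(y))

# Toy data (not from the report): y grows monotonically but not linearly with x.
x = [1, 2, 3, 4, 5]
y = [1, 4, 9, 16, 25]
print(spearman_rho(x, y), pearson(x, y))
```

On this toy input the rank correlation is essentially 1 while Pearson's r is lower, which illustrates why ρ is better suited to nonlinear monotonic relationships.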

Pearson's r

Pearson's correlation coefficient (r) measures linear correlation between two variables. Its value lies between -1 and +1: -1 indicates total negative linear correlation, 0 indicates no linear correlation, and +1 indicates total positive linear correlation. Furthermore, r is invariant under separate changes in location and scale of the two variables, which implies that, for a linear relationship, the steepness of the line does not affect r (only the sign of its slope does).

To calculate r for two variables X and Y, one divides the covariance of X and Y by the product of their standard deviations.
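
A minimal Python sketch of this formula follows, together with a check of the invariance under location and scale changes. The function name `pearson_r` and the toy data are illustrative assumptions, not values from the report.

```python
from math import sqrt

def pearson_r(x, y):
    """Covariance of x and y divided by the product of their standard deviations.

    The 1/n factors cancel, so unnormalised sums are used throughout.
    """
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    covariance = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    spread_x = sqrt(sum((a - mean_x) ** 2 for a in x))
    spread_y = sqrt(sum((b - mean_y) ** 2 for b in y))
    return covariance / (spread_x * spread_y)

# Toy data (not from the report): a noisy, roughly linear relationship.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.1, 3.9, 6.2, 7.8, 10.1]

r_original = pearson_r(x, y)
# Shift and rescale each variable separately with positive scale factors.
r_rescaled = pearson_r([10 * a + 3 for a in x], [0.5 * b - 7 for b in y])
# A negative scale factor on one variable flips the sign of r.
r_flipped = pearson_r(x, [-b for b in y])
print(r_original, r_rescaled, r_flipped)
```

The first two results agree up to floating-point rounding, and the third has the same magnitude with the opposite sign.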

Kendall's τ

Like Spearman's rank correlation coefficient, the Kendall rank correlation coefficient (τ) measures ordinal association between two variables. Its value lies between -1 and +1: -1 indicates total negative correlation, 0 indicates no correlation, and +1 indicates total positive correlation.

To calculate τ for two variables X and Y, one determines the number of concordant and discordant pairs of observations. τ is the number of concordant pairs minus the number of discordant pairs, divided by the total number of pairs.
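
The pair-counting definition above can be sketched directly in Python. The function name `kendall_tau_a` and the toy data are illustrative assumptions. This plain form does not correct for ties, whereas statistical libraries often default to a tie-corrected variant, so results on tied data may differ from the report's.

```python
from itertools import combinations

def kendall_tau_a(x, y):
    """(concordant - discordant) / total pairs, with no correction for ties."""
    concordant = discordant = 0
    for (x1, y1), (x2, y2) in combinations(zip(x, y), 2):
        direction = (x1 - x2) * (y1 - y2)
        if direction > 0:
            concordant += 1      # both variables order the pair the same way
        elif direction < 0:
            discordant += 1      # the variables order the pair oppositely
        # direction == 0 means a tie in x or y; it counts toward neither total
    total_pairs = len(x) * (len(x) - 1) // 2
    return (concordant - discordant) / total_pairs

# Toy data (not from the report): one swapped pair out of six.
x = [1, 2, 3, 4]
y = [1, 3, 2, 4]
print(kendall_tau_a(x, y))
```

Here five of the six pairs are concordant and one is discordant, giving (5 - 1) / 6.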

Phik (φk)

Phik (φk) is a new and practical correlation coefficient that works consistently across categorical, ordinal and interval variables, captures non-linear dependency, and reverts to the Pearson correlation coefficient for a bivariate normal input distribution. The Phik project provides extensive documentation.

Missing values

A simple visualization of nullity by column.
The nullity matrix is a data-dense display that lets you quickly pick out patterns in data completion by eye.
The correlation heatmap measures nullity correlation: how strongly the presence or absence of one variable affects the presence of another.
The dendrogram allows you to more fully correlate variable completion, revealing trends deeper than the pairwise ones visible in the correlation heatmap.
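
As a rough illustration of nullity correlation, the sketch below treats each column as a 0/1 presence indicator and computes Pearson's r between two indicators. This is an assumption about how the heatmap values are obtained, not a statement of the report's implementation; the function name `nullity_correlation` and the toy columns (which only borrow the report's column names) are hypothetical.

```python
from math import sqrt

def nullity_correlation(col_a, col_b):
    """Pearson correlation between the presence indicators of two columns.

    Returns None when either column is entirely present or entirely missing,
    because the indicator then has zero variance and r is undefined.
    """
    present_a = [0.0 if value is None else 1.0 for value in col_a]
    present_b = [0.0 if value is None else 1.0 for value in col_b]
    n = len(present_a)
    mean_a, mean_b = sum(present_a) / n, sum(present_b) / n
    covariance = sum((a - mean_a) * (b - mean_b) for a, b in zip(present_a, present_b))
    var_a = sum((a - mean_a) ** 2 for a in present_a)
    var_b = sum((b - mean_b) ** 2 for b in present_b)
    if var_a == 0 or var_b == 0:
        return None
    return covariance / sqrt(var_a * var_b)

# Toy columns (hypothetical values): None marks a missing cell.
homepage = ["http://a.example", None, None, "http://b.example"]
tagline = ["Tag A", None, None, "Tag B"]
runtime = [124, 120, 119, 136]

print(nullity_correlation(homepage, tagline))   # missing together in the same rows
print(nullity_correlation(homepage, runtime))   # runtime is never missing
```

A value near +1 means the two columns tend to be missing in the same rows; a value near -1 means one tends to be present exactly when the other is missing.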

Sample

First rows

(index)	id	imdb_id	popularity	budget	revenue	original_title	cast	homepage	director	tagline	keywords	overview	runtime	genres	production_companies	release_date	vote_count	vote_average	release_year	budget_adj	revenue_adj
0	135397	tt0369610	32.985763	150000000	1513528810	Jurassic World	Chris Pratt|Bryce Dallas Howard|Irrfan Khan|Vincent D'Onofrio|Nick Robinson	http://www.jurassicworld.com/	Colin Trevorrow	The park is open.	monster|dna|tyrannosaurus rex|velociraptor|island	Twenty-two years after the events of Jurassic Park, Isla Nublar now features a fully functioning dinosaur theme park, Jurassic World, as originally envisioned by John Hammond.	124	Action|Adventure|Science Fiction|Thriller	Universal Studios|Amblin Entertainment|Legendary Pictures|Fuji Television Network|Dentsu	6/9/15	5562	6.5	2015	1.379999e+08	1.392446e+09
1	76341	tt1392190	28.419936	150000000	378436354	Mad Max: Fury Road	Tom Hardy|Charlize Theron|Hugh Keays-Byrne|Nicholas Hoult|Josh Helman	http://www.madmaxmovie.com/	George Miller	What a Lovely Day.	future|chase|post-apocalyptic|dystopia|australia	An apocalyptic story set in the furthest reaches of our planet, in a stark desert landscape where humanity is broken, and most everyone is crazed fighting for the necessities of life. Within this world exist two rebels on the run who just might be able to restore order. There's Max, a man of action and a man of few words, who seeks peace of mind following the loss of his wife and child in the aftermath of the chaos. And Furiosa, a woman of action and a woman who believes her path to survival may be achieved if she can make it across the desert back to her childhood homeland.	120	Action|Adventure|Science Fiction|Thriller	Village Roadshow Pictures|Kennedy Miller Productions	5/13/15	6185	7.1	2015	1.379999e+08	3.481613e+08
2	262500	tt2908446	13.112507	110000000	295238201	Insurgent	Shailene Woodley|Theo James|Kate Winslet|Ansel Elgort|Miles Teller	http://www.thedivergentseries.movie/#insurgent	Robert Schwentke	One Choice Can Destroy You	based on novel|revolution|dystopia|sequel|dystopic future	Beatrice Prior must confront her inner demons and continue her fight against a powerful alliance which threatens to tear her society apart.	119	Adventure|Science Fiction|Thriller	Summit Entertainment|Mandeville Films|Red Wagon Entertainment|NeoReel	3/18/15	2480	6.3	2015	1.012000e+08	2.716190e+08
3	140607	tt2488496	11.173104	200000000	2068178225	Star Wars: The Force Awakens	Harrison Ford|Mark Hamill|Carrie Fisher|Adam Driver|Daisy Ridley	http://www.starwars.com/films/star-wars-episode-vii	J.J. Abrams	Every generation has a story.	android|spaceship|jedi|space opera|3d	Thirty years after defeating the Galactic Empire, Han Solo and his allies face a new threat from the evil Kylo Ren and his army of Stormtroopers.	136	Action|Adventure|Science Fiction|Fantasy	Lucasfilm|Truenorth Productions|Bad Robot	12/15/15	5292	7.5	2015	1.839999e+08	1.902723e+09
4	168259	tt2820852	9.335014	190000000	1506249360	Furious 7	Vin Diesel|Paul Walker|Jason Statham|Michelle Rodriguez|Dwayne Johnson	http://www.furious7.com/	James Wan	Vengeance Hits Home	car race|speed|revenge|suspense|car	Deckard Shaw seeks revenge against Dominic Toretto and his family for his comatose brother.	137	Action|Crime|Thriller	Universal Pictures|Original Film|Media Rights Capital|Dentsu|One Race Films	4/1/15	2947	7.3	2015	1.747999e+08	1.385749e+09
5	281957	tt1663202	9.110700	135000000	532950503	The Revenant	Leonardo DiCaprio|Tom Hardy|Will Poulter|Domhnall Gleeson|Paul Anderson	http://www.foxmovies.com/movies/the-revenant	Alejandro González Iñárritu	(n. One who has returned, as if from the dead.)	father-son relationship|rape|based on novel|mountains|winter	In the 1820s, a frontiersman, Hugh Glass, sets out on a path of vengeance against those who left him for dead after a bear mauling.	156	Western|Drama|Adventure|Thriller	Regency Enterprises|Appian Way|CatchPlay|Anonymous Content|New Regency Pictures	12/25/15	3929	7.2	2015	1.241999e+08	4.903142e+08
6	87101	tt1340138	8.654359	155000000	440603537	Terminator Genisys	Arnold Schwarzenegger|Jason Clarke|Emilia Clarke|Jai Courtney|J.K. Simmons	http://www.terminatormovie.com/	Alan Taylor	Reset the future	saving the world|artificial intelligence|cyborg|killer robot|future	The year is 2029. John Connor, leader of the resistance continues the war against the machines. At the Los Angeles offensive, John's fears of the unknown future begin to emerge when TECOM spies reveal a new plot by SkyNet that will attack him from both fronts; past and future, and will ultimately change warfare forever.	125	Science Fiction|Action|Thriller|Adventure	Paramount Pictures|Skydance Productions	6/23/15	2598	5.8	2015	1.425999e+08	4.053551e+08
7	286217	tt3659388	7.667400	108000000	595380321	The Martian	Matt Damon|Jessica Chastain|Kristen Wiig|Jeff Daniels|Michael Peña	http://www.foxmovies.com/movies/the-martian	Ridley Scott	Bring Him Home	based on novel|mars|nasa|isolation|botanist	During a manned mission to Mars, Astronaut Mark Watney is presumed dead after a fierce storm and left behind by his crew. But Watney has survived and finds himself stranded and alone on the hostile planet. With only meager supplies, he must draw upon his ingenuity, wit and spirit to subsist and find a way to signal to Earth that he is alive.	141	Drama|Adventure|Science Fiction	Twentieth Century Fox Film Corporation|Scott Free Productions|Mid Atlantic Films|International Traders|TSG Entertainment	9/30/15	4572	7.6	2015	9.935996e+07	5.477497e+08
8	211672	tt2293640	7.404165	74000000	1156730962	Minions	Sandra Bullock|Jon Hamm|Michael Keaton|Allison Janney|Steve Coogan	http://www.minionsmovie.com/	Kyle Balda|Pierre Coffin	Before Gru, they had a history of bad bosses	assistant|aftercreditsstinger|duringcreditsstinger|evil mastermind|minions	Minions Stuart, Kevin and Bob are recruited by Scarlet Overkill, a super-villain who, alongside her inventor husband Herb, hatches a plot to take over the world.	91	Family|Animation|Adventure|Comedy	Universal Pictures|Illumination Entertainment	6/17/15	2893	6.5	2015	6.807997e+07	1.064192e+09
9	150540	tt2096673	6.326804	175000000	853708609	Inside Out	Amy Poehler|Phyllis Smith|Richard Kind|Bill Hader|Lewis Black	http://movies.disney.com/inside-out	Pete Docter	Meet the little voices inside your head.	dream|cartoon|imaginary friend|animation|kid	Growing up can be a bumpy road, and it's no exception for Riley, who is uprooted from her Midwest life when her father starts a new job in San Francisco. Like all of us, Riley is guided by her emotions - Joy, Fear, Anger, Disgust and Sadness. The emotions live in Headquarters, the control center inside Riley's mind, where they help advise her through everyday life. As Riley and her emotions struggle to adjust to a new life in San Francisco, turmoil ensues in Headquarters. Although Joy, Riley's main and most important emotion, tries to keep things positive, the emotions conflict on how best to navigate a new city, house and school.	94	Comedy|Animation|Family	Walt Disney Pictures|Pixar Animation Studios|Walt Disney Studios Motion Pictures	6/9/15	3935	8.0	2015	1.609999e+08	7.854116e+08

Last rows

(index)	id	imdb_id	popularity	budget	revenue	original_title	cast	homepage	director	tagline	keywords	overview	runtime	genres	production_companies	release_date	vote_count	vote_average	release_year	budget_adj	revenue_adj
10856	20277	tt0061135	0.140934	0	0	The Ugly Dachshund	Dean Jones|Suzanne Pleshette|Charles Ruggles|Kelly Thordsen|Parley Baer	NaN	Norman Tokar	A HAPPY HONEYMOON GOES TO THE DOGS!...When a Great Dane disguised as a Dachsie crashes the party!	great dane|dachshund	The Garrisons (Dean Jones and Suzanne Pleshette) are the "proud parents" of three adorable dachshund pups -- and one overgrown Great Dane named Brutus, who nevertheless thinks of himself as a dainty dachsie. His identity crisis results in an uproarious series of household crises that reduce the Garrisons' house to shambles -- and viewers to howls of laughter!	93	Comedy|Drama|Family	Walt Disney Pictures	2/16/66	14	5.7	1966	0.000000	0.0
10857	5921	tt0060748	0.131378	0	0	Nevada Smith	Steve McQueen|Karl Malden|Brian Keith|Arthur Kennedy|Suzanne Pleshette	NaN	Henry Hathaway	Some called him savage- and some called him saint... some felt his hate- and one found his love... and three had to die...	repayment|revenge|native american|wild west|half breed	Nevada Smith is the young son of an Indian mother and white father. When his father is killed by three men over gold, Nevada sets out to find them and kill them. The boy is taken in by a gun merchant. The gun merchant shows him how to shoot and to shoot on time and correct.	128	Action|Western	Paramount Pictures|Solar Productions|Embassy Pictures	6/10/66	10	5.9	1966	0.000000	0.0
10858	31918	tt0060921	0.317824	0	0	The Russians Are Coming, The Russians Are Coming	Carl Reiner|Eva Marie Saint|Alan Arkin|Brian Keith|Paul Ford	NaN	Norman Jewison	IT'S A PLOT! ...to make the world die laughing!!	cold war|russian|new england	Without hostile intent, a Soviet sub runs aground off New England. Men are sent for a boat, but many villagers go into a tizzy, risking bloodshed.	126	Comedy|War	The Mirisch Corporation	5/25/66	11	5.5	1966	0.000000	0.0
10859	20620	tt0060955	0.089072	0	0	Seconds	Rock Hudson|Salome Jens|John Randolph|Will Geer|Jeff Corey	NaN	John Frankenheimer	NaN	plastic surgery|suspense	A secret organisation offers wealthy people a second chance at life. The customer picks out someone they want to be and the organisation surgically alters the customer to look like the intended person, stages the customer's death, gets rid of the intended person and the customer takes on a new life.	100	Mystery|Science Fiction|Thriller|Drama	Gibraltar Productions|Joel Productions|John Frankenheimer Productions Inc.	10/5/66	22	6.6	1966	0.000000	0.0
10860	5060	tt0060214	0.087034	0	0	Carry On Screaming!	Kenneth Williams|Jim Dale|Harry H. Corbett|Joan Sims|Charles Hawtrey	NaN	Gerald Thomas	Carry On Screaming with the Hilarious CARRY ON Gang!!	monster|carry on|horror spoof	The sinister Dr Watt has an evil scheme going. He's kidnapping beautiful young women and turning them into mannequins to sell to local stores. Fortunately for Dr Watt, Detective-Sergeant Bung is on the case, and he doesn't have a clue! In this send up of the Hammer Horror movies, there are send-ups of all the horror greats from Frankenstein to Dr Jekyl and Mr Hyde.	87	Comedy	Peter Rogers Productions|Anglo-Amalgamated Film Distributors	5/20/66	13	7.0	1966	0.000000	0.0
10861	21	tt0060371	0.080598	0	0	The Endless Summer	Michael Hynson|Robert August|Lord 'Tally Ho' Blears|Bruce Brown|Chip Fitzwater	NaN	Bruce Brown	NaN	surfer|surfboard|surfing	The Endless Summer, by Bruce Brown, is one of the first and most influential surf movies of all times. The film documents American surfers Mike Hynson and Robert August as they travel the world during California’s winter (which back in 1965 was off-season for surfing) in search of the perfect wave and an endless summer.	95	Documentary	Bruce Brown Films	6/15/66	11	7.4	1966	0.000000	0.0
10862	20379	tt0060472	0.065543	0	0	Grand Prix	James Garner|Eva Marie Saint|Yves Montand|Toshirō Mifune|Brian Bedford	NaN	John Frankenheimer	Cinerama sweeps YOU into a drama of speed and spectacle!	car race|racing|formula 1	Grand Prix driver Pete Aron is fired by his team after a crash at Monaco that injures his teammate, Scott Stoddard. While Stoddard struggles to recover, Aron begins to drive for another team, and starts dating Stoddard's wife.	176	Action|Adventure|Drama	Cherokee Productions|Joel Productions|Douglas & Lewis Productions	12/21/66	20	5.7	1966	0.000000	0.0
10863	39768	tt0060161	0.065141	0	0	Beregis Avtomobilya	Innokentiy Smoktunovskiy|Oleg Efremov|Georgi Zhzhyonov|Olga Aroseva|Lyubov Dobrzhanskaya	NaN	Eldar Ryazanov	NaN	car|trolley|stealing car	An insurance agent who moonlights as a carthief steals cars various crooks and never from the common people. He sells the stolen cars and gives the money to charity. His best friend, a cop, is assigned to bring in this modern robin hood.	94	Mystery|Comedy	Mosfilm	1/1/66	11	6.5	1966	0.000000	0.0
10864	21449	tt0061177	0.064317	0	0	What's Up, Tiger Lily?	Tatsuya Mihashi|Akiko Wakabayashi|Mie Hama|John Sebastian|Tadao Nakamaru	NaN	Woody Allen	WOODY ALLEN STRIKES BACK!	spoof	In comic Woody Allen's film debut, he took the Japanese action film "International Secret Police: Key of Keys" and re-dubbed it, changing the plot to make it revolve around a secret egg salad recipe.	80	Action|Comedy	Benedict Pictures Corp.	11/2/66	22	5.4	1966	0.000000	0.0
10865	22293	tt0060666	0.035919	19000	0	Manos: The Hands of Fate	Harold P. Warren|Tom Neyman|John Reynolds|Diane Mahree|Stephanie Nielson	NaN	Harold P. Warren	It's Shocking! It's Beyond Your Imagination!	fire|gun|drive|sacrifice|flashlight	A family gets lost on the road and stumbles upon a hidden, underground, devil-worshiping cult led by the fearsome Master and his servant Torgo.	74	Horror	Norm-Iris	11/15/66	15	1.5	1966	127642.279154	0.0

Duplicate rows

Most frequently occurring

(index)	id	imdb_id	popularity	budget	revenue	original_title	cast	homepage	director	tagline	keywords	overview	runtime	genres	production_companies	release_date	vote_count	vote_average	release_year	budget_adj	revenue_adj	# duplicates
0	42194	tt0411951	0.59643	30000000	967000	TEKKEN	Jon Foo|Kelly Overton|Cary-Hiroyuki Tagawa|Ian Anthony Dale|Luke Goss	NaN	Dwight H. Little	Survival is no game	martial arts|dystopia|based on video game|martial arts tournament	In the year of 2039, after World Wars destroy much of the civilization as we know it, territories are no longer run by governments, but by corporations; the mightiest of which is the Mishima Zaibatsu. In order to placate the seething masses of this dystopia, Mishima sponsors Tekken, a tournament in which fighters battle until only one is left standing.	92	Crime|Drama|Action|Thriller|Science Fiction	Namco|Light Song Films	3/20/10	110	5.0	2010	30000000.0	967000.0	2